Ilya Zakharevich on Fri, 16 Aug 2024 01:45:49 +0200


[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

2.16.2-beta: 2.45× slowdown in forprime()


Comparing to the older versions, forprime() is perfectly OK at least
until 10^17.  However, for 10^18 it slows down 2.45× times:

  parisize = 10000000000, primelimit = 1100000000, factorlimit = 1048576
  
  (15:52) gp > my(c, L=10^8, k=18); forprime(p=10^k,10^k+L,c++); c
  time = 21,706 ms.
  %3 = 2414886

With older versions, and/or without high primelimit, the timing is about
8.86.

  (Of course, without high primelimit, the timing for smaller values slows down…)

The CPU is:
  powdermilk:/scratch/->lscpu -C | m
  NAME ONE-SIZE ALL-SIZE WAYS TYPE        LEVEL  SETS PHY-LINE  COHERENCY-SIZE
  L1d       48K     544K   12 Data            1    64        1  64
  L1i       32K     704K    8 Instruction     1    64        1  64
  L2       1.3M    11.5M   10 Unified         2  2048        1  64
  L3        24M      24M   12 Unified         3 32768        1  64

WARNING: this is a hybrid CPU.  13th Gen Intel(R) Core(TM) i5-13500T.

  The active size of the table of primes for 10^18 is π(10⁹) = 50,847,534
  “units”.  For 10^17 it is π(√10¹⁷) = 17,082,666 “units”.

With older PARIs, the speed decreases to about 2*10^17, then practically
stabilizes at about 8.8 sec per 10^8 numbers.  Contrary to this, With
newer PARIs, it slows down uniformly in the interval 10^17 .. 10^18:

  (16:35) gp > timeit(s,f) = my(t=gettime(), o=f()); if(#o, o=Str("\t",o)); printf("%s%7.3f%s\n",s,gettime()/1000,o);
  (16:37) gp > my(D=17, c, L=10^8); for(k=1,10, c=0; timeit(Strprintf("%d*10^%d\t",k, D), (()->forprime(p=k*10^D,k*10^D+L,c++);c)))
    1*10^17   7.584 2555873
    2*10^17  10.363 2509779
    3*10^17  12.442 2484002
    4*10^17  14.224 2466309
    5*10^17  15.748 2453186
    6*10^17  17.127 2444935
    7*10^17  18.369 2431918
    8*10^17  19.567 2426520
    9*10^17  20.673 2419632
    10*10^17 21.692 2414886
    
Hope this helps,
Ilya