| Bill Allombert on Mon, 12 Jan 2004 23:44:27 +0100 |
[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]
| some PARI kernel oddities/remarks |
Hello PARI-dev,
While trying to understand why Flx_sqr was so slow,
I found some oddities in the kernel:
1) We have two addshiftw function, one in static in the kernel
but not the other, so it is not really a bug, but probably one is
misnommed
2) level1.h include a adduumod_noofl function biut no prototype
and it seems to not being used anywhere. Do we really want that
function ? If yes we could use it in Flx_addspec if p&HIGHBIT==0
3) sqrispec include the following:
#if 0
c1 = shifti(muliispec(a0,a, n0a,na),1);
#else /* slower !!! */
t = addiispec(a0,a,n0a,na);
t = sqrispec(t+2,lgefint(t)-2);
c1= addiispec(c0+2,c+2, lgefint(c0)-2, lgefint(c)-2);
c1= subiispec(t+2, c1+2, lgefint(t)-2, lgefint(c1)-2);
#endif
meaning it is faster to compute (a+b)^2-a^2-b^2 than 2*a*b.
I checked that with my simple benchmark and it really make a large
difference.
Now, It tried to implement the same trick with Flx_sqrspec but there
it make things much slower...
4) In the (a+b)^2-a^2-b^2=2*a*b formula, we can detect high word
cancellation when a and are of different length (which can happen).
Is it wortwhile to implement it for t_INT ? for Flx ?
5) Is floating-point inverse or Montgomery inverse useful in this
context ?
NTL use floating-point inverse:
double ninv=1/(double)n;
/* Compute a*b%n with 0<= a,b <n */
long MulMod(long a, long b, long n, double ninv)
{
double ab;
long q, res;
ab = ((double) a) * ((double) b);
q = (long) (ab*ninv); // q could be off by (+/-) 1
res = (long) (ab - ((double) q)*((double) n));
if (res >= n)
res -= n;
else if (res < 0)
res += n;
return res;
}
I suppose I will test NTL with my benchmark to start with.
Cheers,
Bill.