Karim BELABAS on Fri, 4 Apr 2003 00:18:34 +0200 (MEST)
Re: polredabs() again
On Thu, 3 Apr 2003, Bill Allombert wrote:
> On Wed, Mar 12, 2003 at 03:09:12PM -0500, Igor Schein wrote:
>> Hi,
>>
>> I know there are too many things going on at the same time right now,
>> but I'm still gonna mention this example of polredabs() inefficiency:
>>
>> polredabs(x^8-97418273277417164826810*x^4-97418273278115084139045*x^2+97418273278115084139045);
>>
>> It takes an extremely long time. Basically, I'd like to see either a
>> quick fix (if possible) or an acknowledgement of the issue in the TODO
>> file :). I know we've discussed this with Karim in private emails,
>> but it never made it into TODO.
>
> [I was expecting Karim to comment but...]

Hiya...

> Well, the underlying algorithm has high complexity, so it is not clear
> that this is a deficiency in PARI, or even that it is fixable, if we
> insist on having the 'absolutely smallest polynomial'.

Let me describe the current algorithm precisely (for future reference, and to spur ideas :-).

1) Compute a (T2-)LLL-reduced basis for the maximal order.

[ This is fast and reliable nowadays, even compared to the stable version. It is especially efficient when the field is totally real: in that case the T2 matrix is integral, and the Schnorr-Euchner strategy is used on an _exact_ matrix, at the lowest possible accuracy. So one sees many "precision increase" messages at \g4, but that is a feature of the algorithm, and the sign that we are making good progress: whenever the precision remains "stuck", we are in a quandary. ]

2) Enumerate [Fincke-Pohst] all vectors smaller than the _best bound found so far_ (for a fixed bound, that is an ellipsoid).
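The enumeration in step 2 runs over the ellipsoid x^T G x <= B attached to the T2 Gram matrix G. Here is a toy Fincke-Pohst in Python, just to fix ideas; this is an illustrative sketch under my own naming, not PARI's smallvectors() (no cache, no check routines, no clever ordering):

```python
import math

def cholesky_q(G):
    """Decompose Q(x) = x^T G x as sum_i q[i][i]*(x_i + sum_{j>i} q[i][j]*x_j)^2,
    the quadratic-form 'Cholesky' decomposition used by Fincke-Pohst."""
    n = len(G)
    q = [[float(v) for v in row] for row in G]
    for i in range(n):
        for j in range(i + 1, n):
            q[j][i] = q[i][j]          # keep a copy of the original entry
            q[i][j] /= q[i][i]
        for k in range(i + 1, n):
            for l in range(k, n):
                q[k][l] -= q[k][i] * q[i][l]
    return q

def short_vectors(G, B):
    """All nonzero integer vectors x with x^T G x <= B, for G positive definite.
    Backtracking over one coordinate at a time, innermost coordinate last."""
    n = len(G)
    q = cholesky_q(G)
    found, x = [], [0] * n
    def rec(i, rem):                   # rem = bound left for coordinates 0..i
        if i < 0:
            if any(x):
                found.append(tuple(x))
            return
        c = sum(q[i][j] * x[j] for j in range(i + 1, n))
        r = math.sqrt(rem / q[i][i]) if rem > 0 else 0.0
        # admissible range: q[i][i]*(x_i + c)^2 <= rem
        for t in range(math.ceil(-c - r - 1e-9), math.floor(-c + r + 1e-9) + 1):
            x[i] = t
            rec(i - 1, rem - q[i][i] * (t + c) ** 2)
        x[i] = 0
    rec(n - 1, float(B))
    return found
```

With G the 2x2 identity and B = 1 this returns the four unit vectors. In polredabs, every enumerated vector is then handed to the "check routine" of step 2, and B shrinks whenever a better generator is found.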
The latter is defined as the T2-norm of the smallest element defining the original field.

[ This is checked by computing explicitly the characteristic polynomial C, in terms of the (known) embeddings, then gcd(C, C'): this is fast, and much more reliable than checking whether the roots of C are close to each other (which would be faster). ]

The _best bound_ is initialized using the norm of the smallest suitable LLL-basis vectors.

[ It's a famous "conjecture" that one of them will be suitable, cf. basemath/base1.c:1378:

    if (!_polred(x, a, NULL, &chk))
      err(talker,"you found a counter-example to a conjecture, please report!");
]

3) To avoid computing too many characteristic polynomials, this mode is switched off after a number of consecutive failures (currently 5): there are presumably many subfields, and we expect many other failures. In this case, the small vectors are cached until a buffer is filled. Then they are sorted with respect to T2-norm, and the characteristic polynomials are computed (smallest elements first).

[ You get the dreaded "sorting..." message at \g4. If we're unlucky, we compute all the characteristic polynomials, find nothing, and discard the lot. ]

[ The implementation is generic: smallvectors() actually takes as arguments a lattice, a bound, the size of the small-vector cache, and pointers to "check routines" which define what a "suitable vector" is. ]

4) If we are lucky, the _best bound so far_ decreases fast, and we avoid most of the enumeration.

5) When computing the initial _best norm_, I compute the number field generated by the initial LLL-basis vectors.

[ Successive compositum(); that's when you get the "subfields" messages. This requires factoring huge polynomials over Z. That was my original motivation for studying van Hoeij's algorithm: it is relatively cheap now. ]

The enumeration stops when this tells us that the linear span of the remaining basis vectors is contained in a strict subfield. If we're lucky, this may prune out a huge part of the enumeration tree.
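The gcd(C, C') test above is the standard criterion: an element generates the field exactly when its characteristic polynomial C is squarefree, i.e. when gcd(C, C') is constant. A self-contained sketch over Q (exact arithmetic via Fraction; the helper names are mine, and PARI builds C from the embeddings rather than from coefficient lists):

```python
from fractions import Fraction

def poly_deriv(c):
    # c = coefficients, lowest degree first
    return [c[i] * i for i in range(1, len(c))]

def poly_mod(a, b):
    # remainder of a divided by b (lowest degree first, b nonzero)
    a = a[:]
    while len(a) >= len(b) and any(a):
        if a[-1] == 0:
            a.pop(); continue
        f = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, bi in enumerate(b):
            a[shift + i] -= f * bi
        a.pop()
    while a and a[-1] == 0:
        a.pop()
    return a

def is_squarefree(c):
    """True iff the polynomial with coefficients c has no repeated root,
    i.e. gcd(C, C') is a nonzero constant."""
    a = [Fraction(x) for x in c]
    b = [Fraction(x) for x in poly_deriv(c)]
    while b:                      # Euclidean algorithm on (C, C')
        a, b = b, poly_mod(a, b)
    return len(a) == 1
```

For instance x^2 - 1 is squarefree (its roots generate a degree-2 extension), while (x-1)^2 = x^2 - 2x + 1 is not: the corresponding element lies in a strict subfield.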
6) I have neglected the very important '16' flag: replace "maximal order" by "some order which is cheap to compute" in the description above.

[ Factor the discriminant up to 'primelimit' only: the order is maximal at all primes up to primelimit, but not guaranteed maximal at larger primes occurring to an odd power in the discriminant of the original generating polynomial (it is in general maximal when the valuation is even, though without guarantee). ]

Three obvious (trivial) improvements:

* For some crazy reason, the enumeration backtracking is done in _decreasing_ (lexicographic) order, so the vectors most likely to have large norm are considered first. It is easy to change, but I never took the time to do it [ have a look at the smallvectors() code to understand why ].

* Order the LLL-basis vectors so as to maximize the length of the initial segment of the basis generating a strict subfield, as in 5) [ currently we use the Fincke-Pohst ordering, which minimizes the number of enumerated vectors for a _fixed_ bound ]. Doing this efficiently is an interesting problem.

* Use random samples to try and improve quickly on the _best bound_.

Now the bad news: if there are many subfields, providing lots of small elements, this will still take a lot of time. It can't be helped.

On the other hand, the current specification of polredabs is quite useless. There is no application whatsoever for a "polynomial of absolutely smallest T2-norm". It is not even guaranteed to have minimal discriminant, or to yield the smallest coefficients. The only use I can see is to give a pseudo-canonical representative for the field (this helps table builders: fewer isomorphism tests...).

In any case, there is an obvious solution: add a further optional argument to polredabs, to sample up to a certain recursion depth (e.g. [N, B] would include at most N vectors in any linear combination, with coefficients < B in absolute value, or the standard computed bound, whichever is lower).
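The proposed [N, B] sampling could look like the following toy search. This is purely illustrative (the function name and interface are mine, not a PARI API): it exhausts the bounded combinations and keeps the smallest T2-norm, whereas the real version would also keep only elements that generate the field.

```python
import itertools

def best_bounded_combination(G, N, B):
    """Among all nonzero integer combinations of at most N basis vectors
    with coefficients |c| < B, return (best norm, coefficients) where the
    norm is x^T G x for the (T2) Gram matrix G."""
    n = len(G)
    best = None
    coeffs = range(-(B - 1), B)               # |c| < B
    for support in itertools.combinations(range(n), min(N, n)):
        for cs in itertools.product(coeffs, repeat=len(support)):
            x = [0] * n
            for i, c in zip(support, cs):
                x[i] = c                      # zeros allowed: covers "at most N"
            if not any(x):
                continue
            norm = sum(x[i] * G[i][j] * x[j]
                       for i in range(n) for j in range(n))
            if best is None or norm < best[0]:
                best = (norm, x)
    return best
```

The work is bounded by roughly C(n, N) * (2B-1)^N norm evaluations, which is the point of the proposal: a guaranteed time limit, at the price of a "sensible" rather than "canonical" result.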
And return the best polynomial found [ possibly the original polynomial ]. Then we'd have a guaranteed time limit on the polredabs() operation, and a guarantee to get a "sensible" polynomial, if not a "canonical" one.

Cheers,

    Karim.
--
Karim Belabas                    Tel: (+33) (0)1 69 15 57 48
Dép. de Mathématiques, Bât. 425  Fax: (+33) (0)1 69 15 60 19
Université Paris-Sud             http://www.math.u-psud.fr/~belabas/
F-91405 Orsay (France)           http://www.parigp-home.de/  [PARI/GP]