Code coverage tests

This page documents the degree to which the PARI/GP source code is tested by our public test suite, distributed with the source distribution in directory src/test/. This is measured by the gcov utility; we then process gcov output using the lcov frond-end.

We test a few variants depending on Configure flags on the pari.math.u-bordeaux.fr machine (x86_64 architecture), and agregate them in the final report:

with GMP kernel
with GMP kernel using --mt=pthread
with native kernel, including micro-assembler code
with native kernel, without micro-assembler
with native kernel, without micro-assembler, disabling GCC extensions (DISABLE_INLINE)
with GMP kernel, emulating an x86_32 architecture at Configure time via setarch

The target is to exceed 90% coverage for all mathematical modules (given that branches depending on DEBUGLEVEL or DEBUGMEM are not covered). This script is run to produce the results below.

LCOV - code coverage report

Current view:	top level - basemath - mpqs.c (source / functions)		Hit	Total	Coverage
Test:	PARI/GP v2.18.1 lcov report (development 30733-eb2d773d38)	Lines:	695	841	82.6 %
Date:	2026-03-03 09:24:12	Functions:	34	38	89.5 %
Legend:	Lines: hit not hit

          Line data    Source code

       1             : /* Copyright (C) 2000  The PARI group.
       2             : 
       3             : This file is part of the PARI/GP package.
       4             : 
       5             : PARI/GP is free software; you can redistribute it and/or modify it under the
       6             : terms of the GNU General Public License as published by the Free Software
       7             : Foundation; either version 2 of the License, or (at your option) any later
       8             : version. It is distributed in the hope that it will be useful, but WITHOUT
       9             : ANY WARRANTY WHATSOEVER.
      10             : 
      11             : Check the License for details. You should have received a copy of it, along
      12             : with the package; see the file 'COPYING'. If not, write to the Free Software
      13             : Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. */
      14             : 
      15             : /* Self-Initializing Multi-Polynomial Quadratic Sieve, based on code developed
      16             :  * as part of the LiDIA project.
      17             :  *
      18             :  * Original version: Thomas Papanikolaou and Xavier Roblot
      19             :  * Extensively modified by The PARI group.
      20             :  * Support for class group computations by Bill Allombert */
      21             : /* Notation commonly used in this file, and sketch of algorithm:
      22             :  *
      23             :  * Given an odd integer N > 1 to be factored, we throw in a small odd squarefree
      24             :  * multiplier k so as to make kN = 1 mod 4 and to have many small primes over
      25             :  * which X^2 - kN splits.  We compute a factor base FB of such primes then
      26             :  * look for values x0 such that Q0(x0) = x0^2 - kN can be decomposed over FB,
      27             :  * up to a possible factor dividing k and a possible "large prime". Relations
      28             :  * involving the latter can be combined into full relations which don't; full
      29             :  * relations, by Gaussian elimination over F2 for the exponent vectors lead us
      30             :  * to an expression X^2 - Y^2 divisible by N and hopefully to a nontrivial
      31             :  * splitting when we compute gcd(X + Y, N).  Note that this can never
      32             :  * split prime powers.
      33             :  *
      34             :  * Candidates x0 are found by sieving along arithmetic progressions modulo the
      35             :  * small primes in FB and evaluation of candidates picks out those x0 where
      36             :  * many of these progressions coincide, resulting in a highly divisible Q0(x0).
      37             :  *
      38             :  * The Multi-Polynomial version improves this by choosing a modest subset of
      39             :  * FB primes (let A be their product) and forcing these to divide Q0(x).
      40             :  * Write Q(x) = Q0(2Ax + B) = (2Ax + B)^2 - kN = 4A(Ax^2 + Bx + C), where B is
      41             :  * suitably chosen.  For each A, there are 2^omega_A possible values for B
      42             :  * but we'll use only half of these, since the other half is easily covered by
      43             :  * exploiting the symmetry x -> -x of the original Q0. The "Self-Initializating"
      44             :  * bit refers to the fact that switching from one B to the next is fast, whereas
      45             :  * switching to the next A involves some recomputation (C is never needed).
      46             :  * Thus we quickly run through many polynomials sharing the same A.
      47             :  *
      48             :  * The sieve ranges over values x0 such that |x0| < M  (we use x = x0 + M
      49             :  * as array subscript).  The coefficients A are chosen so that A*M ~ sqrt(kN).
      50             :  * Then |B| is bounded by ~ (j+4)*A, and |C| = -C ~ (M/4)*sqrt(kN), so
      51             :  * Q(x0)/(4A) takes values roughly between -|C| and 3|C|.
      52             :  *
      53             :  * Refinements. We do not use the smallest FB primes for sieving, incorporating
      54             :  * them only after selecting candidates).  The substitution of 2Ax+B into
      55             :  * X^2 - kN, with odd B, forces 2 to occur; when kN is 1 mod 8, it occurs at
      56             :  * least to the 3rd power; when kN = 5 mod 8, it occurs exactly to the 2nd
      57             :  * power.  We never sieve on 2 and always pull out the power of 2 directly. The
      58             :  * prime factors of k show up whenever 2Ax + B has a factor in common with k;
      59             :  * we don't sieve on these either but easily recognize them in a candidate. */
      60             : 
      61             : #include "paricfg.h"
      62             : #ifdef HAS_SSE2
      63             : #include <emmintrin.h>
      64             : #endif
      65             : 
      66             : #include "pari.h"
      67             : #include "paripriv.h"
      68             : 
      69             : #define DEBUGLEVEL DEBUGLEVEL_mpqs
      70             : 
      71             : /** DEBUG **/
      72             : /* #define MPQS_DEBUG_VERBOSE 1 */
      73             : 
      74             : /* Faster but slowdown hnfadd too much */
      75             : /* #define CLASSGROUP_LARGE_PRIME */
      76             : #include "mpqs.h"
      77             : 
      78             : #define REL_OFFSET 20
      79             : #define REL_MASK ((1UL<<REL_OFFSET)-1)
      80             : #define MAX_PE_PAIR 60
      81             : 
      82             : #ifdef HAS_SSE2
      83             : #define EXT0(a) ((ulong)__builtin_ia32_vec_ext_v2di((__v2di)(a), 0))
      84             : #define EXT1(a) ((ulong)__builtin_ia32_vec_ext_v2di((__v2di)(a), 1))
      85             : #define TEST(a) (EXT0(a) || EXT1(a))
      86             : typedef __v2di mpqs_bit_array;
      87             : const mpqs_bit_array mpqs_mask = { (long) 0x8080808080808080L, (long) 0x8080808080808080UL };
      88             : #else
      89             : /* Use ulong for the bit arrays */
      90             : typedef ulong mpqs_bit_array;
      91             : #define TEST(a) (a)
      92             : 
      93             : #ifdef LONG_IS_64BIT
      94             : const mpqs_bit_array mpqs_mask = 0x8080808080808080UL;
      95             : #else
      96             : const mpqs_bit_array mpqs_mask = 0x80808080UL;
      97             : #endif
      98             : #endif
      99             : 
     100       25440 : static GEN rel_Y(GEN c) { return gel(c,1); }
     101       25440 : static GEN rel_p(GEN c) { return gel(c,2); }
     102             : 
     103             : static void
     104      388804 : frel_add(hashtable *frel, GEN R)
     105             : {
     106      388804 :   ulong h = hash_GEN(R);
     107      388804 :   if (!hash_search2(frel, (void*)R, h))
     108      388797 :     hash_insert2(frel, (void*)R, (void*)1, h);
     109      388804 : }
     110             : 
     111             : /*********************************************************************/
     112             : /**                         INITIAL SIZING                          **/
     113             : /*********************************************************************/
     114             : /* # of decimal digits of argument */
     115             : static long
     116        9872 : decimal_len(GEN N)
     117        9872 : { pari_sp av = avma; return gc_long(av, 1+logint(N, utoipos(10))); }
     118             : 
     119             : /* To be called after choosing k and putting kN into the handle:
     120             :  * Pick up the parameters for given size of kN in decimal digits and fill in
     121             :  * the handle. Return 0 when kN is too large, 1 when we're ok. */
     122             : static int
     123        3307 : mpqs_set_parameters(mpqs_handle_t *h)
     124             : {
     125             :   long s, D;
     126             :   const mpqs_parameterset_t *P;
     127             : 
     128        3307 :   h->digit_size_kN = D = decimal_len(absi(h->kN));
     129        3307 :   if (D > MPQS_MAX_DIGIT_SIZE_KN) return 0;
     130        3307 :   P = &(mpqs_parameters[maxss(0, D - 9)]);
     131        3307 :   h->tolerance   = P->tolerance;
     132        3307 :   h->lp_scale    = P->lp_scale;
     133             :   /* make room for prime factors of k if any: */
     134        3307 :   h->size_of_FB  = s = P->size_of_FB + h->_k->omega_k;
     135             :   /* for the purpose of Gauss elimination etc., prime factors of k behave
     136             :    * like real FB primes, so take them into account when setting the goal: */
     137        3307 :   h->target_rels = (s >= 200 ? s + 10 : (mpqs_int32_t)(s * 1.05));
     138        3307 :   h->M           = P->M;
     139        3307 :   h->omega_A     = P->omega_A;
     140        3307 :   h->no_B        = 1UL << (P->omega_A - 1);
     141        3307 :   h->pmin_index1 = P->pmin_index1;
     142             :   /* certain subscripts into h->FB should also be offset by omega_k: */
     143        3307 :   h->index0_FB   = 3 + h->_k->omega_k;
     144        3307 :   if (DEBUGLEVEL >= 5)
     145             :   {
     146           0 :     err_printf("MPQS: kN = %Ps\n", h->kN);
     147           0 :     err_printf("MPQS: kN has %ld decimal digits\n", D);
     148           0 :     err_printf("\t(estimated memory needed: %4.1fMBy)\n",
     149           0 :                (s + 1)/8388608. * h->target_rels);
     150             :   }
     151        3307 :   return 1;
     152             : }
     153             : 
     154             : /*********************************************************************/
     155             : /**                       OBJECT HOUSEKEEPING                       **/
     156             : /*********************************************************************/
     157             : 
     158             : /* factor base constructor. Really a home-grown memalign(3c) underneath.
     159             :  * We don't want FB entries to straddle L1 cache line boundaries, and
     160             :  * malloc(3c) only guarantees alignment adequate for all primitive data
     161             :  * types of the platform ABI - typically to 8 or 16 byte boundaries.
     162             :  * Also allocate the inv_A_H array.
     163             :  * The FB array pointer is returned for convenience */
     164             : static mpqs_FB_entry_t *
     165        3307 : mpqs_FB_ctor(mpqs_handle_t *h)
     166             : {
     167             :   /* leave room for slots 0, 1, and sentinel slot at the end of the array */
     168        3307 :   long size_FB_chunk = (h->size_of_FB + 3) * sizeof(mpqs_FB_entry_t);
     169             :   /* like FB, except this one does not have a sentinel slot at the end */
     170        3307 :   long size_IAH_chunk = (h->size_of_FB + 2) * sizeof(mpqs_inv_A_H_t);
     171        3307 :   h->FB = (mpqs_FB_entry_t *) stack_malloc_align(size_FB_chunk, 64);
     172        3307 :   h->inv_A_H = (mpqs_inv_A_H_t *) stack_malloc_align(size_IAH_chunk, 64);
     173        3307 :   return h->FB;
     174             : }
     175             : 
     176             : /* sieve array constructor;  also allocates the candidates array
     177             :  * and temporary storage for relations under construction */
     178             : static void
     179        3307 : mpqs_sieve_array_ctor(mpqs_handle_t *h)
     180             : {
     181        3307 :   long size = (h->M << 1) + 1;
     182        3307 :   mpqs_int32_t size_of_FB = h->size_of_FB;
     183             : 
     184        3307 :   h->sieve_array = (unsigned char *) stack_calloc_align(size, sizeof(mpqs_mask));
     185        3307 :   h->sieve_array_end = h->sieve_array + size - 2;
     186        3307 :   h->sieve_array_end[1] = 255; /* sentinel */
     187        3307 :   h->candidates = (long *)stack_malloc(MPQS_CANDIDATE_ARRAY_SIZE * sizeof(long));
     188             :   /* whereas mpqs_self_init() uses size_of_FB+1, we just use the size as
     189             :    * it is, not counting FB[1], to start off the following estimate */
     190        3307 :   if (size_of_FB > MAX_PE_PAIR) size_of_FB = MAX_PE_PAIR;
     191             :   /* and for tracking which primes occur in the current relation: */
     192        3307 :   h->relaprimes = (long *) stack_malloc((size_of_FB << 1) * sizeof(long));
     193        3307 : }
     194             : 
     195             : /* allocate GENs for current polynomial and self-initialization scratch data */
     196             : static void
     197        3307 : mpqs_poly_ctor(mpqs_handle_t *h)
     198             : {
     199        3307 :   mpqs_int32_t i, w = h->omega_A;
     200        3307 :   h->per_A_pr = (mpqs_per_A_prime_t *)
     201        3307 :                 stack_calloc(w * sizeof(mpqs_per_A_prime_t));
     202             :   /* A is the product of w primes, each below word size.
     203             :    * |B| <= (w + 4) * A, so can have at most one word more
     204             :    * H holds residues modulo A: the same size as used for A is sufficient. */
     205        3307 :   h->A = cgeti(w + 2);
     206        3307 :   h->B = cgeti(w + 3);
     207       14064 :   for (i = 0; i < w; i++) h->per_A_pr[i]._H = cgeti(w + 2);
     208        3307 : }
     209             : 
     210             : /*********************************************************************/
     211             : /**                        FACTOR BASE SETUP                        **/
     212             : /*********************************************************************/
     213             : /* fill in the best-guess multiplier k for N. We force kN = 1 mod 4.
     214             :  * Caller should proceed to fill in kN
     215             :  * See Knuth-Schroeppel function in
     216             :  * Robert D. Silverman
     217             :  * The multiple polynomial quadratic sieve
     218             :  * Math. Comp. 48 (1987), 329-339
     219             :  * https://www.ams.org/journals/mcom/1987-48-177/S0025-5718-1987-0866119-8/
     220             :  */
     221             : static ulong
     222        3279 : mpqs_find_k(mpqs_handle_t *h)
     223             : {
     224        3279 :   const pari_sp av = avma;
     225        3279 :   const long N_mod_8 = mod8(h->N), N_mod_4 = N_mod_8 & 3;
     226        3279 :   long dl = decimal_len(h->N);
     227        3279 :   long D = maxss(0, minss(dl,MPQS_MAX_DIGIT_SIZE_KN)-9);
     228        3279 :   long MPQS_MULTIPLIER_SEARCH_DEPTH = mpqs_parameters[D].size_of_FB;
     229             :   forprime_t S;
     230             :   struct {
     231             :     const mpqs_multiplier_t *_k;
     232             :     long np; /* number of primes in factorbase so far for this k */
     233             :     double value; /* the larger, the better */
     234             :   } cache[MPQS_POSSIBLE_MULTIPLIERS];
     235        3279 :   ulong MPQS_NB_MULTIPLIERS = dl < 40 ? 5 : MPQS_POSSIBLE_MULTIPLIERS;
     236             :   ulong p, i, nbk;
     237             : 
     238       32665 :   for (i = nbk = 0; i < numberof(cand_multipliers); i++)
     239             :   {
     240       32665 :     const mpqs_multiplier_t *cand_k = &cand_multipliers[i];
     241       32665 :     long k = cand_k->k;
     242             :     double v;
     243       32665 :     if ((k & 3) != N_mod_4) continue; /* want kN = 1 (mod 4) */
     244       17065 :     v = -log((double)k)/2;
     245       17065 :     if ((k & 7) == N_mod_8) v += M_LN2; /* kN = 1 (mod 8) */
     246       17065 :     cache[nbk].np = 0;
     247       17065 :     cache[nbk]._k = cand_k;
     248       17065 :     cache[nbk].value = v;
     249       17065 :     if (++nbk == MPQS_NB_MULTIPLIERS) break; /* enough */
     250             :   }
     251             :   /* next test is an impossible situation: kills spurious gcc-5.1 warnings
     252             :    * "array subscript is above array bounds" */
     253        3279 :   if (nbk > MPQS_POSSIBLE_MULTIPLIERS) nbk = MPQS_POSSIBLE_MULTIPLIERS;
     254        3279 :   u_forprime_init(&S, 2, ULONG_MAX);
     255      709244 :   while ( (p = u_forprime_next(&S)) )
     256             :   {
     257      709244 :     long kroNp = kroiu(h->N, p), seen = 0;
     258      709244 :     if (!kroNp) return p;
     259     5887884 :     for (i = 0; i < nbk; i++)
     260             :     {
     261             :       long krokp;
     262     5178640 :       if (cache[i].np > MPQS_MULTIPLIER_SEARCH_DEPTH) continue;
     263     4841990 :       seen++;
     264     4841990 :       krokp = krouu(cache[i]._k->k % p, p);
     265     4841990 :       if (krokp == kroNp) /* kronecker(k*N, p)=1 */
     266             :       {
     267     2409822 :         cache[i].value += 2*log((double) p)/p;
     268     2409822 :         cache[i].np++;
     269     2432168 :       } else if (krokp == 0)
     270             :       {
     271       18823 :         cache[i].value += log((double) p)/p;
     272       18823 :         cache[i].np++;
     273             :       }
     274             :     }
     275      709244 :     if (!seen) break; /* we're gone through SEARCH_DEPTH primes for all k */
     276             :   }
     277        3279 :   if (!p) pari_err_OVERFLOW("mpqs_find_k [ran out of primes]");
     278             :   {
     279        3279 :     long best_i = 0;
     280        3279 :     double v = cache[0].value;
     281       17065 :     for (i = 1; i < nbk; i++)
     282       13786 :       if (cache[i].value > v) { best_i = i; v = cache[i].value; }
     283        3279 :     h->_k = cache[best_i]._k; return gc_ulong(av,0);
     284             :   }
     285             : }
     286             : 
     287             : /* Create a factor base of 'size' primes p_i such that legendre(k*N, p_i) != -1
     288             :  * We could have shifted subscripts down from their historical arrangement,
     289             :  * but this seems too risky for the tiny potential gain in memory economy.
     290             :  * The real constraint is that the subscripts of anything which later shows
     291             :  * up at the Gauss stage must be nonnegative, because the exponent vectors
     292             :  * there use the same subscripts to refer to the same FB entries.  Thus in
     293             :  * particular, the entry representing -1 could be put into FB[0], but could
     294             :  * not be moved to FB[-1] (although mpqs_FB_ctor() could be easily adapted
     295             :  * to support negative subscripts).-- The historically grown layout is:
     296             :  * FB[0] is unused.
     297             :  * FB[1] is not explicitly used but stands for -1.
     298             :  * FB[2] contains 2 (always).
     299             :  * Before we are called, the size_of_FB field in the handle will already have
     300             :  * been adjusted by _k->omega_k, so there's room for the primes dividing k,
     301             :  * which when present will occupy FB[3] and following.
     302             :  * The "real" odd FB primes begin at FB[h->index0_FB].
     303             :  * FB[size_of_FB+1] is the last prime p_i.
     304             :  * FB[size_of_FB+2] is a sentinel to simplify some of our loops.
     305             :  * Thus we allocate size_of_FB+3 slots for FB.
     306             :  *
     307             :  * If a prime factor of N is found during the construction, it is returned
     308             :  * in f, otherwise f = 0. */
     309             : 
     310             : /* returns the FB array pointer for convenience */
     311             : static mpqs_FB_entry_t *
     312        3307 : mpqs_create_FB(mpqs_handle_t *h, ulong *f)
     313             : {
     314        3307 :   mpqs_FB_entry_t *FB = mpqs_FB_ctor(h);
     315        3307 :   const pari_sp av = avma;
     316        3307 :   mpqs_int32_t size = h->size_of_FB;
     317             :   long i;
     318        3307 :   mpqs_uint32_t k = h->_k->k;
     319             :   forprime_t S;
     320             : 
     321        3307 :   h->largest_FB_p = 0; /* -Wall */
     322        3307 :   FB[1].fbe_p = -1;
     323        3307 :   FB[2].fbe_p = 2;
     324             :   /* the fbe_logval and the fbe_sqrt_kN for 2 are never used */
     325        3307 :   FB[2].fbe_flags = MPQS_FBE_CLEAR;
     326        5725 :   for (i = 3; i < h->index0_FB; i++)
     327             :   { /* this loop executes h->_k->omega_k = 0, 1, or 2 times */
     328        2418 :     mpqs_uint32_t kp = (ulong)h->_k->kp[i-3];
     329        2418 :     if (MPQS_DEBUGLEVEL >= 7) err_printf(",<%lu>", (ulong)kp);
     330        2418 :     FB[i].fbe_p = kp;
     331             :     /* we could flag divisors of k here, but no need so far */
     332        2418 :     FB[i].fbe_flags = MPQS_FBE_CLEAR;
     333        2418 :     FB[i].fbe_flogp = (float)log2((double) kp);
     334        2418 :     FB[i].fbe_sqrt_kN = 0;
     335             :   }
     336        3307 :   (void)u_forprime_init(&S, 3, ULONG_MAX);
     337      698659 :   while (i < size + 2)
     338             :   {
     339      695352 :     ulong p = u_forprime_next(&S);
     340      695352 :     if (p > k || k % p)
     341             :     {
     342      692934 :       ulong kNp = umodiu(h->kN, p);
     343      692934 :       long kr = krouu(kNp, p);
     344      692934 :       if (kr >= 0)
     345             :       {
     346      354186 :         FB[i].fbe_flags = MPQS_FBE_CLEAR;
     347      354186 :         if (kr == 0)
     348             :         {
     349         105 :           if (f) { *f = p; return FB; }
     350         105 :           if (Z_lval(h->kN, p) > 1) continue;
     351          98 :           FB[i].fbe_flags = MPQS_FBE_DIVIDES_N;
     352             :         }
     353      354179 :         FB[i].fbe_p = (mpqs_uint32_t) p;
     354             :         /* dyadic logarithm of p; single precision suffices */
     355      354179 :         FB[i].fbe_flogp = (float)log2((double)p);
     356             :         /* cannot yet fill in fbe_logval because the scaling multiplier
     357             :          * depends on the largest prime in FB, as yet unknown */
     358             : 
     359             :         /* x such that x^2 = kN (mod p_i) */
     360      354179 :         FB[i++].fbe_sqrt_kN = (mpqs_uint32_t)Fl_sqrt(kNp, p);
     361             :       }
     362             :     }
     363             :   }
     364        3307 :   set_avma(av);
     365        3307 :   if (MPQS_DEBUGLEVEL >= 7)
     366             :   {
     367           0 :     err_printf("MPQS: FB [-1,2");
     368           0 :     for (i = 3; i < h->index0_FB; i++) err_printf(",<%lu>", FB[i].fbe_p);
     369           0 :     for (; i < size + 2; i++) err_printf(",%lu", FB[i].fbe_p);
     370           0 :     err_printf("]\n");
     371             :   }
     372             : 
     373        3307 :   FB[i].fbe_p = 0;              /* sentinel */
     374        3307 :   h->largest_FB_p = FB[i-1].fbe_p; /* at subscript size_of_FB + 1 */
     375             : 
     376             :   /* locate the smallest prime that will be used for sieving */
     377        6079 :   for (i = h->index0_FB; FB[i].fbe_p != 0; i++)
     378        6079 :     if (FB[i].fbe_p >= h->pmin_index1) break;
     379        3307 :   h->index1_FB = i;
     380             :   /* with our parameters this will never fall off the end of the FB */
     381        3307 :   if (f) *f = 0;
     382        3307 :   return FB;
     383             : }
     384             : 
     385             : /*********************************************************************/
     386             : /**                      MISC HELPER FUNCTIONS                      **/
     387             : /*********************************************************************/
     388             : 
     389             : /* Effect of the following:  multiplying the base-2 logarithm of some
     390             :  * quantity by log_multiplier will rescale something of size
     391             :  *    log2 ( sqrt(kN) * M / (largest_FB_prime)^tolerance )
     392             :  * to 232.  Note that sqrt(kN) * M is just A*M^2, the value our polynomials
     393             :  * take at the outer edges of the sieve interval.  The scale here leaves
     394             :  * a little wiggle room for accumulated rounding errors from the approximate
     395             :  * byte-sized scaled logarithms for the factor base primes which we add up
     396             :  * in the sieving phase.-- The threshold is then chosen so that a point in
     397             :  * the sieve has to reach a result which, under the same scaling, represents
     398             :  *    log2 ( sqrt(kN) * M / (largest_FB_prime)^tolerance )
     399             :  * in order to be accepted as a candidate. */
     400             : /* The old formula was...
     401             :  *   log_multiplier =
     402             :  *      127.0 / (0.5 * log2 (handle->dkN) + log2((double)M)
     403             :  *               - tolerance * log2((double)handle->largest_FB_p));
     404             :  * and we used to use this with a constant threshold of 128. */
     405             : 
     406             : /* NOTE: We used to divide log_multiplier by an extra factor 2, and in
     407             :  * compensation we were multiplying by 2 when the fbe_logp fields were being
     408             :  * filled in, making all those bytes even.  Tradeoff: the extra bit of
     409             :  * precision is helpful, but interferes with a possible sieving optimization
     410             :  * (artificially shift right the logp's of primes in A, and just run over both
     411             :  * arithmetical progressions  (which coincide in this case)  instead of
     412             :  * skipping the second one, to avoid the conditional branch in the
     413             :  * mpqs_sieve() loops).  We could still do this, but might lose a little bit
     414             :  * accuracy for those primes.  Probably no big deal. */
     415             : static void
     416        3307 : mpqs_set_sieve_threshold(mpqs_handle_t *h)
     417             : {
     418        3307 :   mpqs_FB_entry_t *FB = h->FB;
     419             :   double log_maxval, log_multiplier;
     420             :   long i;
     421             : 
     422        3307 :   h->l2sqrtkN = 0.5 * log2(h->dkN);
     423        3307 :   h->l2M = log2((double)h->M);
     424        3307 :   log_maxval = h->l2sqrtkN + h->l2M - MPQS_A_FUDGE;
     425        3307 :   log_multiplier = 232.0 / log_maxval;
     426        3307 :   h->sieve_threshold = (unsigned char) (log_multiplier *
     427        3307 :     (log_maxval - h->tolerance * log2((double)h->largest_FB_p))) + 1;
     428             :   /* That "+ 1" really helps - we may want to tune towards somewhat smaller
     429             :    * tolerances  (or introduce self-tuning one day)... */
     430             : 
     431             :   /* If this turns out to be <128, scream loudly.
     432             :    * That means that the FB or the tolerance or both are way too
     433             :    * large for the size of kN.  (Normally, the threshold should end
     434             :    * up in the 150...170 range.) */
     435        3307 :   if (h->sieve_threshold < 128) {
     436           0 :     h->sieve_threshold = 128;
     437           0 :     pari_warn(warner,
     438             :         "MPQS: sizing out of tune, FB size or tolerance\n\ttoo large");
     439             :   }
     440        3307 :   if (DEBUGLEVEL >= 5)
     441           0 :     err_printf("MPQS: sieve threshold: %ld\n",h->sieve_threshold);
     442             :   /* Now fill in the byte-sized approximate scaled logarithms of p_i */
     443        3307 :   if (DEBUGLEVEL >= 5)
     444           0 :     err_printf("MPQS: computing logarithm approximations for p_i in FB\n");
     445      357486 :   for (i = h->index0_FB; i < h->size_of_FB + 2; i++)
     446      354179 :     FB[i].fbe_logval = (unsigned char) (log_multiplier * FB[i].fbe_flogp);
     447        3307 : }
     448             : 
     449             : /* Given the partially populated handle, find the optimum place in the FB
     450             :  * to pick prime factors for A from.  The lowest admissible subscript is
     451             :  * index0_FB, but unless kN is very small, we stay away a bit from that.
     452             :  * The highest admissible is size_of_FB + 1, where the largest FB prime
     453             :  * resides.  The ideal corner is about (sqrt(kN)/M) ^ (1/omega_A),
     454             :  * so that A will end up of size comparable to sqrt(kN)/M;  experimentally
     455             :  * it seems desirable to stay slightly below this.  Moreover, the selection
     456             :  * of the individual primes happens to err on the large side, for which we
     457             :  * compensate a bit, using the (small positive) quantity MPQS_A_FUDGE.
     458             :  * We rely on a few auxiliary fields in the handle to be already set by
     459             :  * mqps_set_sieve_threshold() before we are called.
     460             :  * Return 1 on success, and 0 otherwise. */
     461             : static int
     462        3307 : mpqs_locate_A_range(mpqs_handle_t *h)
     463             : {
     464             :   /* i will be counted up to the desirable index2_FB + 1, and omega_A is never
     465             :    * less than 3, and we want
     466             :    *   index2_FB - (omega_A - 1) + 1 >= index0_FB + omega_A - 3,
     467             :    * so: */
     468        3307 :   long i = h->index0_FB + 2*(h->omega_A) - 4;
     469             :   double l2_target_pA;
     470        3307 :   mpqs_FB_entry_t *FB = h->FB;
     471             : 
     472        3307 :   h->l2_target_A = (h->l2sqrtkN - h->l2M - MPQS_A_FUDGE);
     473        3307 :   l2_target_pA = h->l2_target_A / h->omega_A;
     474             : 
     475             :   /* find the sweet spot, normally shouldn't take long */
     476       40547 :   while (FB[i].fbe_p && FB[i].fbe_flogp <= l2_target_pA) i++;
     477             : 
     478             :   /* check whether this hasn't walked off the top end... */
     479             :   /* The following should actually NEVER happen. */
     480        3307 :   if (i > h->size_of_FB - 3)
     481             :   { /* this isn't going to work at all. */
     482           0 :     pari_warn(warner,
     483             :         "MPQS: sizing out of tune, FB too small or\n\tway too few primes in A");
     484           0 :     return 0;
     485             :   }
     486        3307 :   h->index2_FB = i - 1; return 1;
     487             :   /* assert: index0_FB + (omega_A - 3) [the lowest FB subscript used in primes
     488             :    * for A]  + (omega_A - 2) <= index2_FB  [the subscript from which the choice
     489             :    * of primes for A starts, putting omega_A - 1 of them at or below index2_FB,
     490             :    * and the last and largest one above, cf. mpqs_si_choose_primes]. Moreover,
     491             :    * index2_FB indicates the last prime below the ideal size, unless (when kN
     492             :    * is tiny) the ideal size was too small to use. */
     493             : }
     494             : 
     495             : /*********************************************************************/
     496             : /**                       SELF-INITIALIZATION                       **/
     497             : /*********************************************************************/
     498             : 
     499             : #ifdef MPQS_DEBUG
     500             : /* Debug-only helper routine: check correctness of the root z mod p_i
     501             :  * by evaluating A * z^2 + B * z + C mod p_i  (which should be 0). */
     502             : static void
     503             : check_root(mpqs_handle_t *h, GEN mC, long p, long start)
     504             : {
     505             :   pari_sp av = avma;
     506             :   long z = start - ((long)(h->M) % p);
     507             :   if (umodiu(subii(mulsi(z, addii(h->B, mulsi(z, h->A))), mC), p))
     508             :   {
     509             :     err_printf("MPQS: p = %ld\n", p);
     510             :     err_printf("MPQS: A = %Ps\n", h->A);
     511             :     err_printf("MPQS: B = %Ps\n", h->B);
     512             :     err_printf("MPQS: C = %Ps\n", negi(mC));
     513             :     err_printf("MPQS: z = %ld\n", z);
     514             :     pari_err_BUG("MPQS: self_init: found wrong polynomial");
     515             :   }
     516             :   set_avma(av);
     517             : }
     518             : #endif
     519             : 
     520             : /* Increment *x > 0 to a larger value which has the same number of 1s in its
     521             :  * binary representation.  Wraparound can be detected by the caller as long as
     522             :  * we keep total_no_of_primes_for_A strictly less than BITS_IN_LONG.
     523             :  *
     524             :  * Changed switch to increment *x in all cases to the next larger number
     525             :  * which (a) has the same count of 1 bits and (b) does not arise from the
     526             :  * old value by moving a single 1 bit one position to the left  (which was
     527             :  * undesirable for the sieve). --GN based on discussion with TP */
     528             : INLINE void
     529        9249 : mpqs_increment(mpqs_uint32_t *x)
     530             : {
     531        9249 :   mpqs_uint32_t r1_mask, r01_mask, slider=1UL;
     532             : 
     533        9249 :   switch (*x & 0x1F)
     534             :   { /* 32-way computed jump handles 22 out of 32 cases */
     535         106 :   case 29:
     536         106 :     (*x)++; break; /* shifts a single bit, but we postprocess this case */
     537           0 :   case 26:
     538           0 :     (*x) += 2; break; /* again */
     539        6304 :   case 1: case 3: case 6: case 9: case 11:
     540             :   case 17: case 19: case 22: case 25: case 27:
     541        6304 :     (*x) += 3; return;
     542         105 :   case 20:
     543         105 :     (*x) += 4; break; /* again */
     544         301 :   case 5: case 12: case 14: case 21:
     545         301 :     (*x) += 5; return;
     546        1442 :   case 2: case 7: case 13: case 18: case 23:
     547        1442 :     (*x) += 6; return;
     548           0 :   case 10:
     549           0 :     (*x) += 7; return;
     550           0 :   case 8:
     551           0 :     (*x) += 8; break; /* and again */
     552         310 :   case 4: case 15:
     553         310 :     (*x) += 12; return;
     554         681 :   default: /* 0, 16, 24, 28, 30, 31 */
     555             :     /* isolate rightmost 1 */
     556         681 :     r1_mask = ((*x ^ (*x - 1)) + 1) >> 1;
     557             :     /* isolate rightmost 1 which has a 0 to its left */
     558         681 :     r01_mask = ((*x ^ (*x + r1_mask)) + r1_mask) >> 2;
     559             :     /* simple cases.  Both of these shift a single bit one position to the
     560             :        left, and will need postprocessing */
     561         681 :     if (r1_mask == r01_mask) { *x += r1_mask; break; }
     562         660 :     if (r1_mask == 1) { *x += r01_mask; break; }
     563             :     /* General case: add r01_mask, kill off as many 1 bits as possible to its
     564             :      * right while at the same time filling in 1 bits from the LSB. */
     565         529 :     if (r1_mask == 2) { *x += (r01_mask>>1) + 1; return; }
     566         811 :     while (r01_mask > r1_mask && slider < r1_mask)
     567             :     {
     568         536 :       r01_mask >>= 1; slider <<= 1;
     569             :     }
     570         275 :     *x += r01_mask + slider - 1;
     571         275 :     return;
     572             :   }
     573             :   /* post-process cases which couldn't be finalized above */
     574         363 :   r1_mask = ((*x ^ (*x - 1)) + 1) >> 1;
     575         363 :   r01_mask = ((*x ^ (*x + r1_mask)) + r1_mask) >> 2;
     576         363 :   if (r1_mask == r01_mask) { *x += r1_mask; return; }
     577         356 :   if (r1_mask == 1) { *x += r01_mask; return; }
     578         225 :   if (r1_mask == 2) { *x += (r01_mask>>1) + 1; return; }
     579         259 :   while (r01_mask > r1_mask && slider < r1_mask)
     580             :   {
     581         140 :     r01_mask >>= 1; slider <<= 1;
     582             :   }
     583         119 :   *x += r01_mask + slider - 1;
     584             : }
     585             : 
     586             : /* self-init (1): advancing the bit pattern, and choice of primes for A.
     587             :  * On first call, h->bin_index = 0. On later occasions, we need to begin
     588             :  * by clearing the MPQS_FBE_DIVIDES_A bit in the fbe_flags of the former
     589             :  * prime factors of A (use per_A_pr to find them). Upon successful return, that
     590             :  * array will have been filled in, and the flag bits will have been turned on
     591             :  * again in the right places.
     592             :  * Return 1 when all is fine and 0 when we found we'd be using more bits to
     593             :  * the left in bin_index than we have matching primes in the FB. In the latter
     594             :  * case, bin_index will be zeroed out, index2_FB will be incremented by 2,
     595             :  * index2_moved will be turned on; the caller, after checking that index2_FB
     596             :  * has not become too large, should just call us again, which then succeeds:
     597             :  * we'll start again with a right-justified sequence of 1 bits in bin_index,
     598             :  * now interpreted as selecting primes relative to the new index2_FB. */
     599             : INLINE int
     600       12620 : mpqs_si_choose_primes(mpqs_handle_t *h, GEN missing_primes)
     601             : {
     602       12620 :   mpqs_FB_entry_t *FB = h->FB;
     603       12620 :   mpqs_per_A_prime_t *per_A_pr = h->per_A_pr;
     604       12620 :   double l2_last_p = h->l2_target_A;
     605       12620 :   mpqs_int32_t omega_A = h->omega_A;
     606             :   int i, j, v2, prev_last_p_idx;
     607       12620 :   int room = h->index2_FB - h->index0_FB - omega_A + 4;
     608             :   /* The notion of room here (cf mpqs_locate_A_range() above) is the number
     609             :    * of primes at or below index2_FB which are eligible for A. We need
     610             :    * >= omega_A - 1 of them, and it is guaranteed by mpqs_locate_A_range() that
     611             :    * this many are available: the lowest FB slot used for A is never less than
     612             :    * index0_FB + omega_A - 3. When omega_A = 3 (very small kN), we allow
     613             :    * ourselves to reach all the way down to index0_FB; otherwise, we keep away
     614             :    * from it by at least one position.  For omega_A >= 4 this avoids situations
     615             :    * where the selection of the smaller primes here has advanced to a lot of
     616             :    * very small ones, and the single last larger one has soared away to bump
     617             :    * into the top end of the FB. */
     618             :   mpqs_uint32_t room_mask;
     619             :   mpqs_int32_t p;
     620             :   ulong bits;
     621             : 
     622             :   /* XXX also clear the index_j field here? */
     623       12620 :   if (h->bin_index == 0)
     624             :   { /* first time here, or after increasing index2_FB, initialize to a pattern
     625             :      * of omega_A - 1 consecutive 1 bits. Caller has ensured that there are
     626             :      * enough primes for this in the FB below index2_FB. */
     627        3371 :     h->bin_index = (1UL << (omega_A - 1)) - 1;
     628        3371 :     prev_last_p_idx = 0;
     629             :   }
     630             :   else
     631             :   { /* clear out old flags */
     632       46629 :     for (i = 0; i < omega_A; i++) MPQS_FLG(i) &= ~MPQS_FBE_DIVIDES_A;
     633        9249 :     prev_last_p_idx = MPQS_I(omega_A-1);
     634             : 
     635        9249 :     if (room > 30) room = 30;
     636        9249 :     room_mask = ~((1UL << room) - 1);
     637             : 
     638             :     /* bump bin_index to next acceptable value. If index2_moved is off, call
     639             :      * mpqs_increment() once; otherwise, repeat until there's something in the
     640             :      * least significant 2 bits - to ensure that we never re-use an A which
     641             :      * we'd used before increasing index2_FB - but also stop if something shows
     642             :      * up in the forbidden bits on the left where we'd run out of bits or walk
     643             :      * beyond index0_FB + omega_A - 3. */
     644        9249 :     mpqs_increment(&h->bin_index);
     645        9249 :     if (h->index2_moved)
     646             :     {
     647          25 :       while ((h->bin_index & (room_mask | 0x3)) == 0)
     648           0 :         mpqs_increment(&h->bin_index);
     649             :     }
     650             :     /* did we fall off the edge on the left? */
     651        9249 :     if ((h->bin_index & room_mask) != 0)
     652             :     { /* Yes. Turn on the index2_moved flag in the handle */
     653          64 :       h->index2_FB += 2; /* caller to check this isn't too large!!! */
     654          64 :       h->index2_moved = 1;
     655          64 :       h->bin_index = 0;
     656          64 :       if (MPQS_DEBUGLEVEL >= 5)
     657           0 :         err_printf("MPQS: wrapping, more primes for A now chosen near FB[%ld] = %ld\n",
     658           0 :                    (long)h->index2_FB,
     659           0 :                    (long)FB[h->index2_FB].fbe_p);
     660          64 :       return 0; /* back off - caller should retry */
     661             :     }
     662             :   }
     663             :   /* assert: we aren't occupying any of the room_mask bits now, and if
     664             :    * index2_moved had already been on, at least one of the two LSBs is on */
     665       12556 :   bits = h->bin_index;
     666       12556 :   if (MPQS_DEBUGLEVEL >= 6)
     667           0 :     err_printf("MPQS: new bit pattern for primes for A: 0x%lX\n", bits);
     668             : 
     669             :   /* map bits to FB subscripts, counting downward with bit 0 corresponding
     670             :    * to index2_FB, and accumulate logarithms against l2_last_p */
     671       12556 :   j = h->index2_FB;
     672       12556 :   v2 = vals((long)bits);
     673       12556 :   if (v2) { j -= v2; bits >>= v2; }
     674       35553 :   for (i = omega_A - 2; i >= 0; i--)
     675             :   {
     676       35553 :     MPQS_I(i) = j;
     677       35553 :     l2_last_p -= MPQS_LP(i);
     678       35553 :     if (MPQS_FLG(i) & MPQS_FBE_DIVIDES_N) return 0; /*Retry*/
     679       35406 :     MPQS_FLG(i) |= MPQS_FBE_DIVIDES_A;
     680       35406 :     bits &= ~1UL;
     681       35406 :     if (!bits) break; /* i = 0 */
     682       22997 :     v2 = vals((long)bits); /* > 0 */
     683       22997 :     bits >>= v2; j -= v2;
     684             :   }
     685       12409 :   if (missing_primes)
     686             :   {
     687         231 :     long lm = lg(missing_primes)-1;
     688         231 :     j = missing_primes[1+(h->bin_index%lm)];
     689         231 :     j += h->index0_FB-1;
     690         231 :     if (h->two_is_norm) j--;
     691         231 :     if (FB[j].fbe_flags & MPQS_FBE_DIVIDES_A) return 0; /* Retry */
     692         231 :     p = FB[j].fbe_p;
     693             :   } else
     694             :   {
     695             :     /* Choose the larger prime.  Note we keep index2_FB <= size_of_FB - 3 */
     696      157231 :     for (j = h->index2_FB + 1; (p = FB[j].fbe_p); j++)
     697      157104 :       if (!(FB[j].fbe_flags & MPQS_FBE_DIVIDES_N) && FB[j].fbe_flogp > l2_last_p)
     698       12051 :         break;
     699             :     /* The following trick avoids generating a large proportion of duplicate
     700             :      * relations when the last prime falls into an area where there are large
     701             :      * gaps from one FB prime to the next, and would otherwise often be repeated
     702             :      * (so that successive A's would wind up too similar to each other). While
     703             :      * this trick isn't perfect, it gets rid of a major part of the potential
     704             :      * duplication. */
     705       12178 :     if (p && j == prev_last_p_idx) { j++; p = FB[j].fbe_p; }
     706             :   }
     707       12409 :   MPQS_I(omega_A - 1) = p? j: h->size_of_FB + 1;
     708       12409 :   if (MPQS_FLG(omega_A - 1) & MPQS_FBE_DIVIDES_N) return 0; /*Retry*/
     709       12409 :   MPQS_FLG(omega_A - 1) |= MPQS_FBE_DIVIDES_A;
     710             : 
     711       12409 :   if (MPQS_DEBUGLEVEL >= 6)
     712             :   {
     713           0 :     err_printf("MPQS: chose primes for A");
     714           0 :     for (i = 0; i < omega_A; i++)
     715           0 :       err_printf(" FB[%ld]=%ld%s", (long)MPQS_I(i), (long)MPQS_AP(i),
     716           0 :                  i < omega_A - 1 ? "," : "\n");
     717             :   }
     718       12409 :   return 1;
     719             : }
     720             : 
     721             : /* There are 4 parts to self-initialization, exercised at different times:
     722             :  * - choosing a new sqfree coef. A (selecting its prime factors, FB bookkeeping)
     723             :  * - doing the actual computations attached to a new A
     724             :  * - choosing a new B keeping the same A (much simpler)
     725             :  * - a small common bit that needs to happen in both cases.
     726             :  * As to the first item, the scheme works as follows: pick omega_A - 1 prime
     727             :  * factors for A below the index2_FB point which marks their ideal size, and
     728             :  * one prime above this point, choosing the latter so log2(A) ~ l2_target_A.
     729             :  * Lower prime factors are chosen using bit patterns of constant weight,
     730             :  * gradually moving away from index2_FB towards smaller FB subscripts.
     731             :  * If this bumps into index0_FB (for very small input), back up by increasing
     732             :  * index2_FB by two, and from then on choosing only bit patterns with either or
     733             :  * both of their bottom bits set, so at least one of the omega_A - 1 smaller
     734             :  * prime factor will be beyond the original index2_FB point. In this way we
     735             :  * avoid re-using the same A. (The choice of the upper "flyer" prime is
     736             :  * constrained by the size of the FB, which normally should never a problem.
     737             :  * For tiny kN, we might have to live with a nonoptimal choice.)
     738             :  *
     739             :  * Mathematically, we solve a quadratic (over F_p for each prime p in the FB
     740             :  * which doesn't divide A), a linear equation for each prime p | A, and
     741             :  * precompute differences between roots mod p so we can adjust the roots
     742             :  * quickly when we change B. See Thomas Sosnowski's Diplomarbeit. */
     743             : /* compute coefficients of sieving polynomial for self initializing variant.
     744             :  * Coefficients A and B are set (preallocated GENs) and several tables are
     745             :  * updated. */
     746             : static int
     747      143255 : mpqs_self_init(mpqs_handle_t *h, GEN missing_primes)
     748             : {
     749      143255 :   const ulong size_of_FB = h->size_of_FB + 1;
     750      143255 :   mpqs_FB_entry_t *FB = h->FB;
     751      143255 :   mpqs_inv_A_H_t *inv_A_H = h->inv_A_H;
     752      143255 :   const pari_sp av = avma;
     753      143255 :   GEN p1, A = h->A, B = h->B, mC;
     754      143255 :   mpqs_per_A_prime_t *per_A_pr = h->per_A_pr;
     755             :   long i, j;
     756             : 
     757             : #ifdef MPQS_DEBUG
     758             :   err_printf("MPQS DEBUG: enter self init, avma = 0x%lX\n", (ulong)avma);
     759             : #endif
     760      143255 :   if (++h->index_j == (mpqs_uint32_t)h->no_B)
     761             :   { /* all the B's have been used, choose new A; this is indicated by setting
     762             :      * index_j to 0 */
     763        8752 :     h->index_j = 0;
     764        8752 :     h->index_i++; /* count finished A's */
     765             :   }
     766             : 
     767      143255 :   if (missing_primes || h->index_j == 0)
     768       12409 :   { /* compute first polynomial with new A */
     769             :     GEN a, b, A2;
     770             :     int err;
     771       12620 :     while((err=mpqs_si_choose_primes(h, missing_primes))<=0)
     772             :     {
     773             :       /* Ran out of room towards small primes, and index2_FB was raised. */
     774         211 :       if (err < 0) return 0;
     775         211 :       if (size_of_FB - h->index2_FB < 4) return 0; /* Fail */
     776             :     }
     777             :     /* bin_index and per_A_pr now populated with consistent values */
     778             : 
     779             :     /* compute A = product of omega_A primes given by bin_index */
     780       12409 :     a = b = NULL;
     781       59958 :     for (i = 0; i < h->omega_A; i++)
     782             :     {
     783       47549 :       ulong p = MPQS_AP(i);
     784       47549 :       a = a? muliu(a, p): utoipos(p);
     785             :     }
     786       12409 :     affii(a, A);
     787             :     /* Compute H[i], 0 <= i < omega_A.  Also compute the initial
     788             :      * B = sum(v_i*H[i]), by taking all v_i = +1
     789             :      * TODO: following needs to be changed later for segmented FB and sieve
     790             :      * interval, where we'll want to precompute several B's. */
     791       59958 :     for (i = 0; i < h->omega_A; i++)
     792             :     {
     793       47549 :       ulong p = MPQS_AP(i);
     794       47549 :       GEN t = divis(A, (long)p);
     795       47549 :       t = remii(mulii(t, muluu(Fl_inv(umodiu(t, p), p), MPQS_SQRT(i))), A);
     796       47549 :       affii(t, MPQS_H(i));
     797       47549 :       b = b? addii(b, t): t;
     798             :     }
     799       12409 :     if (mod2(h->kN))
     800             :     {
     801             :       /* ensure b = 1 mod 4 */
     802       12094 :       if (mod2(b) == 0)
     803        5785 :         b = addii(b, mului(mod4(A), A)); /* B += (A % 4) * A; */
     804             :     }
     805             :     else
     806             :     {
     807             :       /* ensure b = 0 mod 2 */
     808         315 :       if (mod2(b) != 0)
     809         210 :         b = addii(b, A);
     810             :     }
     811       12409 :     affii(b, B); set_avma(av);
     812             : 
     813       12409 :     A2 = shifti(A, 1);
     814             :     /* compute the roots z1, z2, of the polynomial Q(x) mod p_j and
     815             :      * initialize start1[i] with the first value p_i | Q(z1 + i p_j)
     816             :      * initialize start2[i] with the first value p_i | Q(z2 + i p_j)
     817             :      * The following loop does The Right Thing for primes dividing k (where
     818             :      * sqrt_kN is 0 mod p). Primes dividing A are skipped here, and are handled
     819             :      * further down in the common part of SI. */
     820     4226947 :     for (j = 3; (ulong)j <= size_of_FB; j++)
     821             :     {
     822             :       ulong s, mb, t, m, p, iA2, iA;
     823     4214538 :       if (FB[j].fbe_flags & MPQS_FBE_DIVIDES_A) continue;
     824     4166989 :       p = (ulong)FB[j].fbe_p;
     825     4166989 :       m = h->M % p;
     826     4166989 :       iA2 = Fl_inv(umodiu(A2, p), p); /* = 1/(2*A) mod p_j */
     827     4166989 :       iA = iA2 << 1; if (iA > p) iA -= p;
     828     4166989 :       mb = umodiu(B, p); if (mb) mb = p - mb; /* mb = -B mod p */
     829     4166989 :       s = FB[j].fbe_sqrt_kN;
     830     4166989 :       t = Fl_add(m, Fl_mul(Fl_sub(mb, s, p), iA2, p), p);
     831     4166989 :       FB[j].fbe_start1 = (mpqs_int32_t)t;
     832     4166989 :       FB[j].fbe_start2 = (mpqs_int32_t)Fl_add(t, Fl_mul(s, iA, p), p);
     833    25015170 :       for (i = 0; i < h->omega_A - 1; i++)
     834             :       {
     835    20848181 :         ulong h = umodiu(MPQS_H(i), p);
     836    20848181 :         MPQS_INV_A_H(i,j) = Fl_mul(h, iA, p); /* 1/A * H[i] mod p_j */
     837             :       }
     838             :     }
     839             :   }
     840             :   else
     841             :   { /* no "real" computation -- use recursive formula */
     842             :     /* The following exploits that B is the sum of omega_A terms +-H[i]. Each
     843             :      * time we switch to a new B, we choose a new pattern of signs; the
     844             :      * precomputation of the inv_A_H array allows us to change the two
     845             :      * arithmetic progressions equally fast. The choice of sign patterns does
     846             :      * not follow the bit pattern of the ordinal number of B in the current
     847             :      * cohort; rather, we use a Gray code, changing only one sign each time.
     848             :      * When the i-th rightmost bit of the new ordinal number index_j of B is 1,
     849             :      * the sign of H[i] is changed; the next bit to the left tells us whether
     850             :      * we should be adding or subtracting the difference term. We never need to
     851             :      * change the sign of H[omega_A-1] (the topmost one), because that would
     852             :      * just give us the same sieve items Q(x) again with the opposite sign
     853             :      * of x.  This is why we only precomputed inv_A_H up to i = omega_A - 2. */
     854      130846 :     ulong p, v2 = vals(h->index_j); /* new starting positions for sieving */
     855      130846 :     j = h->index_j >> v2;
     856      130846 :     p1 = shifti(MPQS_H(v2), 1);
     857      130846 :     if (j & 2)
     858             :     { /* j = 3 mod 4 */
     859    86565838 :       for (j = 3; (ulong)j <= size_of_FB; j++)
     860             :       {
     861    86506704 :         if (FB[j].fbe_flags & MPQS_FBE_DIVIDES_A) continue;
     862    86150112 :         p = (ulong)FB[j].fbe_p;
     863    86150112 :         FB[j].fbe_start1 = Fl_sub(FB[j].fbe_start1, MPQS_INV_A_H(v2,j), p);
     864    86150112 :         FB[j].fbe_start2 = Fl_sub(FB[j].fbe_start2, MPQS_INV_A_H(v2,j), p);
     865             :       }
     866       59134 :       p1 = addii(B, p1);
     867             :     }
     868             :     else
     869             :     { /* j = 1 mod 4 */
     870    90932534 :       for (j = 3; (ulong)j <= size_of_FB; j++)
     871             :       {
     872    90860822 :         if (FB[j].fbe_flags & MPQS_FBE_DIVIDES_A) continue;
     873    90455571 :         p = (ulong)FB[j].fbe_p;
     874    90455571 :         FB[j].fbe_start1 = Fl_add(FB[j].fbe_start1, MPQS_INV_A_H(v2,j), p);
     875    90455571 :         FB[j].fbe_start2 = Fl_add(FB[j].fbe_start2, MPQS_INV_A_H(v2,j), p);
     876             :       }
     877       71712 :       p1 = subii(B, p1);
     878             :     }
     879      130846 :     affii(p1, B);
     880             :   }
     881             : 
     882             :   /* p=2 is a special case.  start1[2], start2[2] are never looked at,
     883             :    * so don't bother setting them. */
     884             : 
     885             :   /* compute zeros of polynomials that have only one zero mod p since p | A */
     886      143255 :   mC = diviiexact(subii(h->kN, sqri(B)), shifti(A, 2)); /* coefficient -C */
     887      952647 :   for (i = 0; i < h->omega_A; i++)
     888             :   {
     889      809392 :     ulong p = MPQS_AP(i), s = h->M + Fl_div(umodiu(mC, p), umodiu(B, p), p);
     890      809392 :     FB[MPQS_I(i)].fbe_start1 = FB[MPQS_I(i)].fbe_start2 = (mpqs_int32_t)(s % p);
     891             :   }
     892             : #ifdef MPQS_DEBUG
     893             :   for (j = 3; j <= size_of_FB; j++)
     894             :   {
     895             :     check_root(h, mC, FB[j].fbe_p, FB[j].fbe_start1);
     896             :     check_root(h, mC, FB[j].fbe_p, FB[j].fbe_start2);
     897             :   }
     898             : #endif
     899      143255 :   if (MPQS_DEBUGLEVEL >= 6)
     900           0 :     err_printf("MPQS: chose Q_%ld(x) = %Ps x^2 %c %Ps x + C\n",
     901           0 :                (long) h->index_j, h->A,
     902           0 :                signe(h->B) < 0? '-': '+', absi_shallow(h->B));
     903      143255 :   set_avma(av);
     904             : #ifdef MPQS_DEBUG
     905             :   err_printf("MPQS DEBUG: leave self init, avma = 0x%lX\n", (ulong)avma);
     906             : #endif
     907      143255 :   return 1;
     908             : }
     909             : 
     910             : /*********************************************************************/
     911             : /**                           THE SIEVE                             **/
     912             : /*********************************************************************/
     913             : /* p4 = 4*p, logp ~ log(p), B/E point to the beginning/end of a sieve array */
     914             : INLINE void
     915      813450 : mpqs_sieve_p(unsigned char *B, unsigned char *E, long p4, long p,
     916             :              unsigned char logp)
     917             : {
     918      813450 :   unsigned char *e = E - p4;
     919             :   /* Unrolled loop. It might be better to let the compiler worry about this
     920             :    * kind of optimization, based on its knowledge of whatever useful tricks the
     921             :    * machine instruction set architecture is offering */
     922    19609859 :   while (e - B >= 0) /* signed comparison */
     923             :   {
     924    18796409 :     (*B) += logp, B += p;
     925    18796409 :     (*B) += logp, B += p;
     926    18796409 :     (*B) += logp, B += p;
     927    18796409 :     (*B) += logp, B += p;
     928             :   }
     929     2775360 :   while (E - B >= 0) (*B) += logp, B += p;
     930      813450 : }
     931             : 
     932             : INLINE void
     933   127050742 : mpqs_sieve_p1(unsigned char *B, unsigned char *E, long s1, long s2,
     934             :              unsigned char logp)
     935             : {
     936   569794774 :   while (E - B >= 0)
     937             :   {
     938   483046786 :     (*B) += logp, B += s1;
     939   483046786 :     if (E - B < 0) break;
     940   442744032 :     (*B) += logp, B += s2;
     941             :   }
     942   127050742 : }
     943             : 
     944             : INLINE void
     945    52976113 : mpqs_sieve_p2(unsigned char *B, unsigned char *E, long p4, long s1, long s2,
     946             :              unsigned char logp)
     947             : {
     948    52976113 :   unsigned char *e = E - p4;
     949             :   /* Unrolled loop. It might be better to let the compiler worry about this
     950             :    * kind of optimization, based on its knowledge of whatever useful tricks the
     951             :    * machine instruction set architecture is offering */
     952   791929393 :   while (e - B >= 0) /* signed comparison */
     953             :   {
     954   738953280 :     (*B) += logp, B += s1;
     955   738953280 :     (*B) += logp, B += s2;
     956   738953280 :     (*B) += logp, B += s1;
     957   738953280 :     (*B) += logp, B += s2;
     958   738953280 :     (*B) += logp, B += s1;
     959   738953280 :     (*B) += logp, B += s2;
     960   738953280 :     (*B) += logp, B += s1;
     961   738953280 :     (*B) += logp, B += s2;
     962             :   }
     963   162632918 :   while (E - B >= 0) {(*B) += logp, B += s1; if (E - B < 0) break; (*B) += logp, B += s2;}
     964    52976113 : }
     965             : static void
     966      143255 : mpqs_sieve(mpqs_handle_t *h)
     967             : {
     968      143255 :   long p, l = h->index1_FB;
     969      143255 :   mpqs_FB_entry_t *FB = &(h->FB[l]);
     970      143255 :   unsigned char *S = h->sieve_array, *Send = h->sieve_array_end;
     971      143255 :   long size = h->M << 1, size4 = size >> 3;
     972      143255 :   memset((void*)S, 0, size * sizeof(unsigned char));
     973    53931859 :   for (  ; (p = FB->fbe_p) && p <= size4; FB++) /* l++ */
     974             :   {
     975    53788604 :     unsigned char logp = FB->fbe_logval;
     976    53788604 :     long s1 = FB->fbe_start1, s2 = FB->fbe_start2;
     977             :     /* sieve with FB[l] from start1[l], and from start2[l] if s1 != s2 */
     978    53788604 :     if (s1 == s2) mpqs_sieve_p(S + s1, Send, p << 2, p, logp);
     979             :     else
     980             :     {
     981    52976113 :       if (s1>s2) lswap(s1,s2)
     982    52976113 :       mpqs_sieve_p2(S + s1, Send, p << 2, s2-s1,p+s1-s2, logp);
     983             :     }
     984             :   }
     985   127194956 :   for (   ; (p = FB->fbe_p) && p <= size; FB++) /* l++ */
     986             :   {
     987   127051701 :     unsigned char logp = FB->fbe_logval;
     988   127051701 :     long s1 = FB->fbe_start1, s2 = FB->fbe_start2;
     989             :     /* sieve with FB[l] from start1[l], and from start2[l] if s1 != s2 */
     990   127051701 :     if (s1 == s2) mpqs_sieve_p(S + s1, Send, p << 2, p, logp);
     991             :     else
     992             :     {
     993   127050742 :       if (s1>s2) lswap(s1,s2)
     994   127050742 :       mpqs_sieve_p1(S + s1, Send, s2-s1, p+s1-s2, logp);
     995             :     }
     996             :   }
     997      143255 :   for (    ; (p = FB->fbe_p); FB++)
     998             :   {
     999           0 :     unsigned char logp = FB->fbe_logval;
    1000           0 :     long s1 = FB->fbe_start1, s2 = FB->fbe_start2;
    1001           0 :     if (s1 < size) S[s1] += logp;
    1002           0 :     if (s2!=s1 && s2 < size) S[s2] += logp;
    1003             :   }
    1004      143255 : }
    1005             : 
    1006             : /* Could use the fact that 4 | M, but let the compiler worry about unrolling. */
    1007             : static long
    1008      143255 : mpqs_eval_sieve(mpqs_handle_t *h)
    1009             : {
    1010      143255 :   long x = 0, count = 0, M2 = h->M << 1;
    1011      143255 :   unsigned char t = h->sieve_threshold;
    1012      143255 :   unsigned char *S = h->sieve_array;
    1013      143255 :   mpqs_bit_array * U = (mpqs_bit_array *) S;
    1014      143255 :   long *cand = h->candidates;
    1015      143255 :   const long sizemask = sizeof(mpqs_mask);
    1016             : 
    1017             :   /* Exploiting the sentinel, we don't need to check for x < M2 in the inner
    1018             :    * while loop; more than makes up for the lack of explicit unrolling. */
    1019    10711999 :   while (count < MPQS_CANDIDATE_ARRAY_SIZE - 1)
    1020             :   {
    1021             :     long j, y;
    1022   504353795 :     while (!TEST(U[x]&mpqs_mask)) x++;
    1023    10711999 :     y = x*sizemask;
    1024   167189643 :     for (j=0; j<sizemask; j++, y++)
    1025             :     {
    1026   156620899 :       if (y >= M2)
    1027      143255 :         { cand[count] = 0; return count; }
    1028   156477644 :       if (S[y]>=t)
    1029      535318 :         cand[count++] = y;
    1030             :     }
    1031    10568744 :     x++;
    1032             :   }
    1033           0 :   cand[count] = 0; return count;
    1034             : }
    1035             : 
    1036             : /*********************************************************************/
    1037             : /**                     CONSTRUCTING RELATIONS                      **/
    1038             : /*********************************************************************/
    1039             : 
    1040             : /* only used for debugging */
    1041             : static void
    1042           0 : split_relp(GEN rel, GEN *prelp, GEN *prelc)
    1043             : {
    1044           0 :   long j, l = lg(rel);
    1045             :   GEN relp, relc;
    1046           0 :   *prelp = relp = cgetg(l, t_VECSMALL);
    1047           0 :   *prelc = relc = cgetg(l, t_VECSMALL);
    1048           0 :   for (j=1; j<l; j++)
    1049             :   {
    1050           0 :     relc[j] = rel[j] >> REL_OFFSET;
    1051           0 :     relp[j] = rel[j] & REL_MASK;
    1052             :   }
    1053           0 : }
    1054             : 
    1055             : #ifdef MPQS_DEBUG
    1056             : static GEN
    1057             : mpqs_factorback(mpqs_handle_t *h, GEN relp)
    1058             : {
    1059             :   GEN N = h->N, Q = gen_1;
    1060             :   long j, l = lg(relp);
    1061             :   for (j = 1; j < l; j++)
    1062             :   {
    1063             :     long e = relp[j] >> REL_OFFSET, i = relp[j] & REL_MASK;
    1064             :     if (i == 1) Q = Fp_neg(Q,N); /* special case -1 */
    1065             :     else Q = Fp_mul(Q, Fp_pows(utoipos(h->FB[i].fbe_p), e, N), N);
    1066             :   }
    1067             :   return Q;
    1068             : }
    1069             : static void
    1070             : mpqs_check_rel(mpqs_handle_t *h, GEN c, ulong q, long mode)
    1071             : {
    1072             :   pari_sp av = avma;
    1073             :   GEN Y = rel_Y(c), Qx_2 = remii(sqri(Y), h->N), rhs;
    1074             :   if (mode == MPQS_MODE_CLASSGROUP)
    1075             :   {
    1076             :     if (signe(Y)==0 || q!=1) { set_avma(av); return; }
    1077             :     q = 4;
    1078             :   }
    1079             :   rhs = Fp_mulu(mpqs_factorback(h, rel_p(c)), q, h->N);
    1080             :   if (!equalii(Qx_2, rhs))
    1081             :   {
    1082             :     GEN relpp, relpc;
    1083             :     split_relp(rel_p(c), &relpp, &relpc);
    1084             :     err_printf("MPQS: %Ps : %Ps %Ps\n", Y, relpp,relpc);
    1085             :     err_printf("\tQx_2 = %Ps\n", Qx_2);
    1086             :     err_printf("\t rhs = %Ps\n", rhs);
    1087             :     pari_err_BUG(q ? "MPQS: wrong large prime relation found"
    1088             :                    : "MPQS: wrong full relation found");
    1089             :   }
    1090             :   PRINT_IF_VERBOSE(q? "\b(;)": "\b(:)");
    1091             :   set_avma(av);
    1092             : }
    1093             : #endif
    1094             : 
    1095             : static void
    1096      414097 : rel_to_ei(GEN ei, GEN relp)
    1097             : {
    1098      414097 :   long j, l = lg(relp);
    1099     6303591 :   for (j=1; j<l; j++)
    1100             :   {
    1101     5889494 :     long e = relp[j] >> REL_OFFSET, i = relp[j] & REL_MASK;
    1102     5889494 :     ei[i] += e;
    1103             :   }
    1104      414097 : }
    1105             : 
    1106             : static void
    1107           0 : rel_add_ei(mpqs_handle_t *h, GEN ei, GEN relp, GEN b)
    1108             : {
    1109           0 :   long j, l = lg(relp);
    1110           0 :   for (j = 1; j < l; j++)
    1111             :   {
    1112           0 :     long e = relp[j] >> REL_OFFSET, i = relp[j] & REL_MASK;
    1113           0 :     ulong p = h->FB[i].fbe_p;
    1114           0 :     ei[i] += umodiu(b, p<<1) > p ? -e :e;
    1115             :   }
    1116           0 : }
    1117             : 
    1118             : static void
    1119           0 : rel_sub_ei(mpqs_handle_t *h, GEN ei, GEN relp, GEN b)
    1120             : {
    1121           0 :   long j, l = lg(relp);
    1122           0 :   for (j = 1; j < l; j++)
    1123             :   {
    1124           0 :     long e = relp[j] >> REL_OFFSET, i = relp[j] & REL_MASK;
    1125           0 :     ulong p = h->FB[i].fbe_p;
    1126           0 :     ei[i] -= umodiu(b, p<<1) > p ? -e :e;
    1127             :   }
    1128           0 : }
    1129             : 
    1130             : static void
    1131     7617829 : mpqs_add_factor(GEN relp, long *i, ulong ei, ulong pi)
    1132     7617829 : { relp[++*i] = pi | (ei << REL_OFFSET); }
    1133             : 
    1134             : static int
    1135       12720 : zv_is_even(GEN V)
    1136             : {
    1137       12720 :   long i, l = lg(V);
    1138     1098145 :   for (i=1; i<l; i++)
    1139     1097559 :     if (odd(uel(V,i))) return 0;
    1140         586 :   return 1;
    1141             : }
    1142             : 
    1143             : static GEN
    1144       12720 : combine_large_primes(mpqs_handle_t *h, ulong q, GEN rel1, GEN rel2, int mode)
    1145             : {
    1146       12720 :   GEN new_Y, new_Y1, Y1 = rel_Y(rel1), Y2 = rel_Y(rel2);
    1147       12720 :   long l, lei = h->size_of_FB + 1, nb = 0;
    1148             :   GEN ei, relp, iq;
    1149             : 
    1150       12720 :   if (!invmod(utoi(q), h->N, &iq)) return equalii(iq, h->N)? NULL: iq; /* rare */
    1151       12720 :   ei = zero_zv(lei);
    1152       12720 :   if (mode == MPQS_MODE_CLASSGROUP)
    1153             :   {
    1154           0 :      rel_add_ei(h, ei, rel_p(rel1), Y1);
    1155           0 :     if (umodiu(Y1, q) == umodiu(Y2,q))
    1156           0 :       rel_sub_ei(h, ei, rel_p(rel2), Y2);
    1157             :     else
    1158           0 :       rel_add_ei(h, ei, rel_p(rel2), Y2);
    1159           0 :     new_Y = gen_0;
    1160             :   }
    1161             :   else
    1162             :   {
    1163       12720 :     rel_to_ei(ei, rel_p(rel1));
    1164       12720 :     rel_to_ei(ei, rel_p(rel2));
    1165       12720 :     if (zv_is_even(ei)) return NULL;
    1166       12134 :     new_Y = modii(mulii(mulii(Y1, Y2), iq), h->N);
    1167       12134 :     new_Y1 = subii(h->N, new_Y);
    1168       12134 :     if (abscmpii(new_Y1, new_Y) < 0) new_Y = new_Y1;
    1169             :   }
    1170       12134 :   relp = cgetg(MAX_PE_PAIR+1,t_VECSMALL);
    1171       12134 :   if (odd(ei[1])) mpqs_add_factor(relp, &nb, 1, 1);
    1172    20934649 :   for (l = 2; l <= lei; l++)
    1173    20922515 :     if (ei[l]) mpqs_add_factor(relp, &nb, ei[l],l);
    1174       12134 :   setlg(relp, nb+1);
    1175       12134 :   if (DEBUGLEVEL >= 6)
    1176             :   {
    1177             :     GEN relpp, relpc, rel1p, rel1c, rel2p, rel2c;
    1178           0 :     split_relp(relp,&relpp,&relpc);
    1179           0 :     split_relp(rel1,&rel1p,&rel1c);
    1180           0 :     split_relp(rel2,&rel2p,&rel2c);
    1181           0 :     err_printf("MPQS: combining\n");
    1182           0 :     err_printf("    {%Ps @ %Ps : %Ps}\n", q, Y1, rel1p, rel1c);
    1183           0 :     err_printf("  * {%Ps @ %Ps : %Ps}\n", q, Y2, rel2p, rel2c);
    1184           0 :     err_printf(" == {%Ps, %Ps}\n", relpp, relpc);
    1185             :   }
    1186             : #ifdef MPQS_DEBUG
    1187             :   if (mode == MPQS_MODE_FACTOR)
    1188             :   {
    1189             :     pari_sp av1 = avma;
    1190             :     if (!equalii(modii(sqri(new_Y), h->N), mpqs_factorback(h, relp)))
    1191             :       pari_err_BUG("MPQS: combined large prime relation is false");
    1192             :     set_avma(av1);
    1193             :   }
    1194             : #endif
    1195       12134 :   return mkvec2(new_Y, relp);
    1196             : }
    1197             : 
    1198             : /* nc candidates */
    1199             : static GEN
    1200      133630 : mpqs_eval_cand(mpqs_handle_t *h, long nc, hashtable *frel, hashtable *lprel, int mode)
    1201             : {
    1202      133630 :   mpqs_FB_entry_t *FB = h->FB;
    1203      133630 :   GEN A = h->A, B = h->B;
    1204      133630 :   long *relaprimes = h->relaprimes, *candidates = h->candidates;
    1205             :   long pi, i;
    1206             :   int pii;
    1207      133630 :   mpqs_per_A_prime_t *per_A_pr = h->per_A_pr;
    1208      133630 :   int two_bad = h->two_is_bad;
    1209             : 
    1210      668948 :   for (i = 0; i < nc; i++)
    1211             :   {
    1212      535318 :     pari_sp btop = avma;
    1213      535318 :     GEN Qx, Qx_part, Y, relp = cgetg(MAX_PE_PAIR+1,t_VECSMALL);
    1214      535318 :     long powers_of_2, p, x = candidates[i], nb = 0;
    1215      535318 :     int relaprpos = 0;
    1216             :     long k;
    1217      535318 :     unsigned char thr = h->sieve_array[x];
    1218             :     /* Y = 2*A*x + B, Qx = Y^2/(4*A) = Q(x) */
    1219      535318 :     Y = addii(mulis(A, 2 * (x - h->M)), B);
    1220      535318 :     Qx = subii(sqri(Y), h->kN); /* != 0 since N not a square and (N,k) = 1 */
    1221      535318 :     if (signe(Qx) < 0)
    1222             :     {
    1223      281763 :       setabssign(Qx);
    1224      281763 :       mpqs_add_factor(relp, &nb, 1, 1); /* i = 1, ei = 1, pi */
    1225             :     }
    1226             :     /* Qx > 0, divide by powers of 2; we're really dealing with 4*A*Q(x), so we
    1227             :      * always have at least 2^2 here, and at least 2^3 when kN = 1 mod 4 */
    1228      535318 :     powers_of_2 = two_bad ? 2 : vali(Qx);
    1229      535318 :     Qx = shifti(Qx, -powers_of_2);
    1230      535318 :     if (mode == MPQS_MODE_CLASSGROUP)
    1231             :     {
    1232       11025 :       if (powers_of_2!=2)
    1233        5789 :         mpqs_add_factor(relp, &nb, powers_of_2 - 2, 2);
    1234             :     }
    1235             :     else
    1236      524293 :       mpqs_add_factor(relp, &nb, powers_of_2, 2); /* i = 1, ei = 1, pi */
    1237             :     /* When N is small, it may happen that N | Qx outright. In any case, when
    1238             :      * no extensive prior trial division / Rho / ECM was attempted, gcd(Qx,N)
    1239             :      * may turn out to be a nontrivial factor of N (not in FB or we'd have
    1240             :      * found it already, but possibly smaller than the large prime bound). This
    1241             :      * is too rare to check for here in the inner loop, but it will be caught
    1242             :      * if such an LP relation is ever combined with another. */
    1243             : 
    1244             :     /* Pass 1 over odd primes in FB: pick up all possible divisors of Qx
    1245             :      * including those sitting in k or in A, and remember them in relaprimes.
    1246             :      * Do not yet worry about possible repeated factors, these will be found in
    1247             :      * the Pass 2. Pass 1 recognizes divisors of A by their corresponding flags
    1248             :      * bit in the FB entry. (Divisors of k are ignored at this stage.)
    1249             :      * We construct a preliminary table of FB subscripts and "exponents" of FB
    1250             :      * primes which divide Qx. (We store subscripts, not the primes themselves.)
    1251             :      * We distinguish three cases:
    1252             :      * 0) prime in A which does not divide Qx/A,
    1253             :      * 1) prime not in A which divides Qx/A,
    1254             :      * 2) prime in A which divides Qx/A.
    1255             :      * Cases 1 and 2 need checking for repeated factors, kind 0 doesn't.
    1256             :      * Cases 0 and 1 contribute 1 to the exponent in the relation, case 2
    1257             :      * contributes 2.
    1258             :      * Factors in common with k are simpler: if they occur, they occur
    1259             :      * exactly to the first power, and this makes no difference in Pass 1,
    1260             :      * so they behave just like every normal odd FB prime. */
    1261     2684153 :     for (Qx_part = A, pi = 3; pi< h->index1_FB; pi++)
    1262             :     {
    1263     2148835 :       ulong p = FB[pi].fbe_p;
    1264     2148835 :       long xp = x % p;
    1265             :       /* Here we used that MPQS_FBE_DIVIDES_A = 1. */
    1266             : 
    1267     2148835 :       if (xp == FB[pi].fbe_start1 || xp == FB[pi].fbe_start2)
    1268             :       { /* p divides Q(x)/A and possibly A, case 2 or 3 */
    1269      537478 :         ulong ei = FB[pi].fbe_flags & MPQS_FBE_DIVIDES_A;
    1270      537478 :         relaprimes[relaprpos++] = pi;
    1271      537478 :         relaprimes[relaprpos++] = 1 + ei;
    1272      537478 :         Qx_part = muliu(Qx_part, p);
    1273             :       }
    1274             :     }
    1275   315496787 :     for (  ; thr && (p = FB[pi].fbe_p); pi++)
    1276             :     {
    1277   314961469 :       long xp = x % p;
    1278             :       /* Here we used that MPQS_FBE_DIVIDES_A = 1. */
    1279             : 
    1280   314961469 :       if (xp == FB[pi].fbe_start1 || xp == FB[pi].fbe_start2)
    1281             :       { /* p divides Q(x)/A and possibly A, case 2 or 3 */
    1282     3426086 :         ulong ei = FB[pi].fbe_flags & MPQS_FBE_DIVIDES_A;
    1283     3426086 :         relaprimes[relaprpos++] = pi;
    1284     3426086 :         relaprimes[relaprpos++] = 1 + ei;
    1285     3426086 :         Qx_part = muliu(Qx_part, p);
    1286     3426086 :         thr -= FB[pi].fbe_logval;
    1287             :       }
    1288             :     }
    1289     3150055 :     for (k = 0;  k< h->omega_A; k++)
    1290             :     {
    1291     2614737 :       long pi = MPQS_I(k);
    1292     2614737 :       ulong p = FB[pi].fbe_p;
    1293     2614737 :       long xp = x % p;
    1294     2614737 :       if (!(xp == FB[pi].fbe_start1 || xp == FB[pi].fbe_start2))
    1295             :       { /* p divides A but does not divide Q(x)/A, case 1 */
    1296     2536111 :         relaprimes[relaprpos++] = pi;
    1297     2536111 :         relaprimes[relaprpos++] = 0;
    1298             :       }
    1299             :     }
    1300             :     /* We have accumulated the known factors of Qx except for possible repeated
    1301             :      * factors and for possible large primes.  Divide off what we have.
    1302             :      * This is faster than dividing off A and each prime separately. */
    1303      535318 :     Qx = diviiexact(Qx, Qx_part);
    1304             : 
    1305             : #ifdef MPQS_DEBUG
    1306             :     err_printf("MPQS DEBUG: eval loop 3, avma = 0x%lX\n", (ulong)avma);
    1307             : #endif
    1308             :     /* Pass 2: deal with repeated factors and store tentative relation. At this
    1309             :      * point, the only primes which can occur again in the adjusted Qx are
    1310             :      * those in relaprimes which are followed by 1 or 2. We must pick up those
    1311             :      * followed by a 0, too. */
    1312             :     PRINT_IF_VERBOSE("a");
    1313     7034993 :     for (pii = 0; pii < relaprpos; pii += 2)
    1314             :     {
    1315     6499675 :       ulong r, ei = relaprimes[pii+1];
    1316             :       GEN q;
    1317             : 
    1318     6499675 :       pi = relaprimes[pii];
    1319             :       /* p | k (identified by its index before index0_FB)* or p | A (ei = 0) */
    1320     6499675 :       if ((mpqs_int32_t)pi < h->index0_FB || ei == 0)
    1321             :       {
    1322     2594242 :         mpqs_add_factor(relp, &nb, 1, pi);
    1323     2594242 :         continue;
    1324             :       }
    1325     3905433 :       p = FB[pi].fbe_p;
    1326             :       /* p might still divide the current adjusted Qx. Try it. */
    1327     3905433 :       switch(cmpiu(Qx, p))
    1328             :       {
    1329       88732 :         case 0: ei++; Qx = gen_1; break;
    1330     1253670 :         case 1:
    1331     1253670 :           q = absdiviu_rem(Qx, p, &r);
    1332     1340173 :           while (r == 0) { ei++; Qx = q; q = absdiviu_rem(Qx, p, &r); }
    1333     1253670 :           break;
    1334             :       }
    1335     3905433 :       mpqs_add_factor(relp, &nb, ei, pi);
    1336             :     }
    1337             : 
    1338             : #ifdef MPQS_DEBUG
    1339             :     err_printf("MPQS DEBUG: eval loop 4, avma = 0x%lX\n", (ulong)avma);
    1340             : #endif
    1341             :     PRINT_IF_VERBOSE("\bb");
    1342      535318 :     setlg(relp, nb+1);
    1343      535318 :     if (is_pm1(Qx))
    1344             :     {
    1345      376670 :       GEN rel = gc_GEN(btop, mkvec2(absi_shallow(Y), relp));
    1346             : #ifdef MPQS_DEBUG
    1347             :       mpqs_check_rel(h, rel, 1, mode);
    1348             : #endif
    1349      376670 :       frel_add(frel, rel);
    1350             :     }
    1351      158648 :     else if (cmpiu(Qx, h->lp_bound) <= 0 && (mode!=MPQS_MODE_CLASSGROUP
    1352             : #ifdef CLASSGROUP_LARGE_PRIME
    1353             : /* primes dividing the index are excluded from FB, so they can still divide Qx.
    1354             :    Add a crude check */
    1355             :             || is_pm1(gcdii(h->N, Qx))
    1356             : #endif
    1357             :             ))
    1358      145974 :     {
    1359      145974 :       ulong q = itou(Qx);
    1360      145974 :       GEN rel = mkvec2(absi_shallow(Y), relp);
    1361      145974 :       GEN col = hash_haskey_GEN(lprel, (void*)q);
    1362             : #ifdef MPQS_DEBUG
    1363             :       mpqs_check_rel(h, rel, q, mode);
    1364             : #endif
    1365      145974 :       if (!col) /* relation up to large prime */
    1366      133254 :         hash_insert(lprel, (void*)q, (void*)gc_GEN(btop,rel));
    1367       12720 :       else if ((rel = combine_large_primes(h, q, rel, col, mode)))
    1368             :       {
    1369       12134 :         if (typ(rel) == t_INT) return rel; /* very unlikely */
    1370             : #ifdef MPQS_DEBUG
    1371             :         mpqs_check_rel(h, rel, 1, mode);
    1372             : #endif
    1373       12134 :         frel_add(frel, gc_GEN(btop,rel));
    1374             :       }
    1375             :       else
    1376         586 :         set_avma(btop);
    1377             :     }
    1378             :     else
    1379             :     { /* TODO: check for double large prime */
    1380             :       PRINT_IF_VERBOSE("\b.");
    1381       12674 :       set_avma(btop);
    1382             :     }
    1383             :   }
    1384             :   PRINT_IF_VERBOSE("\n");
    1385      133630 :   return NULL;
    1386             : }
    1387             : 
    1388             : /*********************************************************************/
    1389             : /**                    FROM RELATIONS TO DIVISORS                   **/
    1390             : /*********************************************************************/
    1391             : 
    1392             : static GEN
    1393       10465 : rels_to_pairs(mpqs_handle_t *h, GEN relp)
    1394             : {
    1395       10465 :   long j, k, l = lg(relp);
    1396       10465 :   GEN P = cgetg(l, t_VECSMALL), E = cgetg(l, t_VECSMALL);
    1397      125643 :   for (j = 1, k = 1; j < l; j++)
    1398             :   {
    1399      115178 :     long e = relp[j] >> REL_OFFSET, i = relp[j] & REL_MASK;
    1400      115178 :     P[k] = i==1 ? -1: h->FB[i].fbe_p;
    1401      115178 :     E[k++] = e;
    1402             :   }
    1403       10465 :   setlg(P,k); setlg(E,k);
    1404       10465 :   return mkvec2(P,E);
    1405             : }
    1406             : 
    1407             : static GEN
    1408         336 : rels_to_quad(mpqs_handle_t *h, GEN rel)
    1409             : {
    1410         336 :   long i, cols = lg(rel)-1;
    1411             :   GEN m, idx;
    1412         336 :   idx = gen_indexsort(rel, (void *) cmp_universal, cmp_nodata);
    1413         336 :   m = cgetg(cols+1, t_VEC);
    1414       10801 :   for (i = 1; i <= cols; i++)
    1415             :   {
    1416       10465 :     GEN r = gel(rel, idx[i]), re = gel(r,2);
    1417       10465 :     GEN R = rels_to_pairs(h, re);
    1418       10465 :     gel(m, i) = mkvec2(gel(r, 1), R);
    1419             :   }
    1420         336 :   return m;
    1421             : }
    1422             : 
    1423             : /* create an F2m from a relations list */
    1424             : static GEN
    1425        3376 : rels_to_F2Ms(GEN rel)
    1426             : {
    1427        3376 :   long i, cols = lg(rel)-1;
    1428        3376 :   GEN m = cgetg(cols+1, t_VEC);
    1429      386441 :   for (i = 1; i <= cols; i++)
    1430             :   {
    1431      383065 :     GEN relp = gmael(rel,i,2), rel2;
    1432      383065 :     long j, l = lg(relp), o = 0, k;
    1433     5476865 :     for (j = 1; j < l; j++)
    1434     5093800 :       if (odd(relp[j] >> REL_OFFSET)) o++;
    1435      383065 :     rel2 = cgetg(o+1, t_VECSMALL);
    1436     5476865 :     for (j = 1, k = 1; j < l; j++)
    1437     5093800 :       if (odd(relp[j] >> REL_OFFSET))
    1438     4684467 :         rel2[k++] = relp[j] & REL_MASK;
    1439      383065 :     gel(m, i) = rel2;
    1440             :   }
    1441        3376 :   return m;
    1442             : }
    1443             : 
    1444             : static int
    1445        7124 : split(GEN *D, long *e)
    1446             : {
    1447             :   ulong mask;
    1448             :   long flag;
    1449        7124 :   if (MR_Jaeschke(*D)) { *e = 1; return 1; } /* probable prime */
    1450         431 :   if (Z_issquareall(*D, D))
    1451             :   { /* squares could cost us a lot of time */
    1452         120 :     if (DEBUGLEVEL >= 5) err_printf("MPQS: decomposed a square\n");
    1453         120 :     *e = 2; return 1;
    1454             :   }
    1455         311 :   mask = 7;
    1456             :   /* 5th/7th powers aren't worth the trouble. OTOH once we have the hooks for
    1457             :    * dealing with cubes, higher powers can be handled essentially for free) */
    1458         311 :   if ((flag = is_357_power(*D, D, &mask)))
    1459             :   {
    1460          14 :     if (DEBUGLEVEL >= 5)
    1461           0 :       err_printf("MPQS: decomposed a %s power\n", uordinal(flag));
    1462          14 :     *e = flag; return 1;
    1463             :   }
    1464         297 :   *e = 0; return 0; /* known composite */
    1465             : }
    1466             : 
    1467             : /* return a GEN structure containing NULL but safe for gc_upto */
    1468             : static GEN
    1469        3376 : mpqs_solve_linear_system(mpqs_handle_t *h, hashtable *frel)
    1470             : {
    1471        3376 :   mpqs_FB_entry_t *FB = h->FB;
    1472        3376 :   pari_sp av = avma;
    1473        3376 :   GEN rels = hash_keys_GEN(frel), N = h->N, r, c, res, ei, M, Ker;
    1474             :   long i, j, nrows, rlast, rnext, rmax, rank;
    1475             : 
    1476        3376 :   M = rels_to_F2Ms(rels);
    1477        3376 :   Ker = F2Ms_ker(M, h->size_of_FB+1); rank = lg(Ker)-1;
    1478        3376 :   if (DEBUGLEVEL >= 4)
    1479             :   {
    1480           0 :     if (DEBUGLEVEL >= 7)
    1481           0 :       err_printf("\\\\ MPQS RELATION MATRIX\nFREL=%Ps\nKERNEL=%Ps\n",M, Ker);
    1482           0 :     err_printf("MPQS: Gauss done: kernel has rank %ld, taking gcds...\n", rank);
    1483             :   }
    1484        3376 :   if (!rank)
    1485             :   { /* trivial kernel; main loop may look for more relations */
    1486           0 :     if (DEBUGLEVEL >= 3)
    1487           0 :       pari_warn(warner, "MPQS: no solutions found from linear system solver");
    1488           0 :     return gc_NULL(av); /* no factors found */
    1489             :   }
    1490             : 
    1491             :   /* Expect up to 2^rank pairwise coprime factors, but a kernel basis vector
    1492             :    * may not contribute to the decomposition; r stores the factors and c what
    1493             :    * we know about them (0: composite, 1: probably prime, >= 2: proper power) */
    1494        3376 :   ei = cgetg(h->size_of_FB + 2, t_VECSMALL);
    1495        3376 :   rmax = logint(N, utoi(3));
    1496        3376 :   if (rank <= BITS_IN_LONG-2)
    1497        3353 :     rmax = minss(rmax, 1L<<rank); /* max # of factors we can hope for */
    1498        3376 :   r = cgetg(rmax+1, t_VEC);
    1499        3376 :   c = cgetg(rmax+1, t_VECSMALL);
    1500        3376 :   rnext = rlast = 1;
    1501        3376 :   nrows = lg(M)-1;
    1502        9253 :   for (i = 1; i <= rank; i++)
    1503             :   { /* loop over kernel basis */
    1504        9142 :     GEN X = gen_1, Y_prod = gen_1, X_plus_Y, D;
    1505        9142 :     pari_sp av2 = avma, av3;
    1506        9142 :     long done = 0; /* # probably-prime factors or powers whose bases we won't
    1507             :                     * handle any further */
    1508        9142 :     memset((void *)(ei+1), 0, (h->size_of_FB + 1) * sizeof(long));
    1509     1121122 :     for (j = 1; j <= nrows; j++)
    1510     1111980 :       if (F2m_coeff(Ker, j, i))
    1511             :       {
    1512      388657 :         GEN R = gel(rels,j);
    1513      388657 :         Y_prod = gc_INT(av2, remii(mulii(Y_prod, gel(R,1)), N));
    1514      388657 :         rel_to_ei(ei, gel(R,2));
    1515             :       }
    1516        9142 :     av3 = avma;
    1517     1051452 :     for (j = 2; j <= h->size_of_FB + 1; j++)
    1518     1042310 :       if (ei[j])
    1519             :       {
    1520      661572 :         GEN q = utoipos(FB[j].fbe_p);
    1521      661572 :         if (ei[j] & 1) pari_err_BUG("MPQS (relation is a nonsquare)");
    1522      661572 :         X = remii(mulii(X, Fp_powu(q, (ulong)ei[j]>>1, N)), N);
    1523      661572 :         X = gc_INT(av3, X);
    1524             :       }
    1525        9142 :     if (MPQS_DEBUGLEVEL >= 1 && !dvdii(subii(sqri(X), sqri(Y_prod)), N))
    1526             :     {
    1527           0 :       err_printf("MPQS: X^2 - Y^2 != 0 mod N\n");
    1528           0 :       err_printf("\tindex i = %ld\n", i);
    1529           0 :       pari_warn(warner, "MPQS: wrong relation found after Gauss");
    1530             :     }
    1531             :     /* At this point, gcd(X-Y, N) * gcd(X+Y, N) = N:
    1532             :      * 1) N | X^2 - Y^2, so it divides the LHS;
    1533             :      * 2) let P be any prime factor of N. If P | X-Y and P | X+Y, then P | 2X
    1534             :      * But X is a product of powers of FB primes => coprime to N.
    1535             :      * Hence we work with gcd(X+Y, N) alone. */
    1536        9142 :     X_plus_Y = addii(X, Y_prod);
    1537        9142 :     if (rnext == 1)
    1538             :     { /* we still haven't decomposed, and want both a gcd and its cofactor. */
    1539        8599 :       D = gcdii(X_plus_Y, N);
    1540        8599 :       if (is_pm1(D) || equalii(D,N)) { set_avma(av2); continue; }
    1541             :       /* got something that works */
    1542        3279 :       if (DEBUGLEVEL >= 5)
    1543           0 :         err_printf("MPQS: splitting N after %ld kernel vector%s\n",
    1544             :                    i+1, (i? "s" : ""));
    1545        3279 :       gel(r,1) = diviiexact(N, D);
    1546        3279 :       gel(r,2) = D;
    1547        3279 :       rlast = rnext = 3;
    1548        3279 :       if (split(&gel(r,1), &c[1])) done++;
    1549        3279 :       if (split(&gel(r,2), &c[2])) done++;
    1550        3279 :       if (done == 2 || rmax == 2) break;
    1551         297 :       if (DEBUGLEVEL >= 5)
    1552           0 :         err_printf("MPQS: got two factors, looking for more...\n");
    1553             :     }
    1554             :     else
    1555             :     { /* we already have factors */
    1556        1912 :       for (j = 1; j < rnext; j++)
    1557             :       { /* loop over known-composite factors */
    1558             :         /* skip probable primes and also roots of pure powers: they are a lot
    1559             :          * smaller than N and should be easy to deal with later */
    1560        1369 :         if (c[j]) { done++; continue; }
    1561         543 :         av3 = avma; D = gcdii(X_plus_Y, gel(r,j));
    1562         543 :         if (is_pm1(D) || equalii(D, gel(r,j))) { set_avma(av3); continue; }
    1563             :         /* got one which splits this factor */
    1564         283 :         if (DEBUGLEVEL >= 5)
    1565           0 :           err_printf("MPQS: resplitting a factor after %ld kernel vectors\n",
    1566             :                      i+1);
    1567         283 :         gel(r,j) = diviiexact(gel(r,j), D);
    1568         283 :         gel(r,rnext) = D;
    1569         283 :         if (split(&gel(r,j), &c[j])) done++;
    1570             :         /* Don't increment done: happens later when we revisit c[rnext] during
    1571             :          * the present inner loop. */
    1572         283 :         (void)split(&gel(r,rnext), &c[rnext]);
    1573         283 :         if (++rnext > rmax) break; /* all possible factors seen */
    1574             :       } /* loop over known composite factors */
    1575             : 
    1576         543 :       if (rnext > rlast)
    1577             :       {
    1578         283 :         if (DEBUGLEVEL >= 5)
    1579           0 :           err_printf("MPQS: got %ld factors%s\n", rlast - 1,
    1580             :                      (done < rlast ? ", looking for more..." : ""));
    1581         283 :         rlast = rnext;
    1582             :       }
    1583             :       /* break out if we have rmax factors or all current factors are probable
    1584             :        * primes or tiny roots from pure powers */
    1585         543 :       if (rnext > rmax || done == rnext - 1) break;
    1586             :     }
    1587             :   }
    1588        3376 :   if (rnext == 1) return gc_NULL(av); /* no factors found */
    1589             : 
    1590             :   /* normal case: convert to ifac format as described in ifactor1.c (value,
    1591             :    * exponent, class [unknown, known composite, known prime]) */
    1592        3279 :   rlast = rnext - 1; /* # of distinct factors found */
    1593        3279 :   res = cgetg(3*rlast + 1, t_VEC);
    1594        3279 :   if (DEBUGLEVEL >= 6) err_printf("MPQS: wrapping up %ld factors\n", rlast);
    1595       10120 :   for (i = j = 1; i <= rlast; i++, j += 3)
    1596             :   {
    1597        6841 :     long C  = c[i];
    1598        6841 :     icopyifstack(gel(r,i), gel(res,j)); /* factor */
    1599        6841 :     gel(res,j+1) = C <= 1? gen_1: utoipos(C); /* exponent */
    1600        6841 :     gel(res,j+2) = C ? NULL: gen_0; /* unknown or known composite */
    1601        6841 :     if (DEBUGLEVEL >= 6)
    1602           0 :       err_printf("\tpackaging %ld: %Ps ^%ld (%s)\n", i, gel(r,i),
    1603           0 :                  itos(gel(res,j+1)), (C? "unknown": "composite"));
    1604             :   }
    1605        3279 :   return res;
    1606             : }
    1607             : 
    1608             : /*********************************************************************/
    1609             : /**               MAIN ENTRY POINT AND DRIVER ROUTINE               **/
    1610             : /*********************************************************************/
    1611             : static void
    1612           7 : toolarge()
    1613           7 : { pari_warn(warner, "MPQS: number too big to be factored with MPQS,\n\tgiving up"); }
    1614             : 
    1615             : static void
    1616           0 : mpqs_status(mpqs_handle_t *h, mpqs_FB_entry_t *FB)
    1617             : {
    1618           0 :   err_printf("MPQS: sieving interval = [%ld, %ld]\n", -(long)h->M, (long)h->M);
    1619             :   /* that was a little white lie, we stop one position short at the top */
    1620           0 :   err_printf("MPQS: size of factor base = %ld\n", (long)h->size_of_FB);
    1621           0 :   err_printf("MPQS: striving for %ld relations\n", (long)h->target_rels);
    1622           0 :   err_printf("MPQS: coefficients A will be built from %ld primes each\n",
    1623           0 :              (long)h->omega_A);
    1624           0 :   err_printf("MPQS: primes for A to be chosen near FB[%ld] = %ld\n",
    1625           0 :              (long)h->index2_FB, (long)FB[h->index2_FB].fbe_p);
    1626           0 :   err_printf("MPQS: smallest prime used for sieving FB[%ld] = %ld\n",
    1627           0 :              (long)h->index1_FB, (long)FB[h->index1_FB].fbe_p);
    1628           0 :   err_printf("MPQS: largest prime in FB = %ld\n", (long)h->largest_FB_p);
    1629           0 :   err_printf("MPQS: bound for `large primes' = %ld\n", (long)h->lp_bound);
    1630           0 :   if (DEBUGLEVEL >= 5)
    1631           0 :     err_printf("MPQS: sieve threshold = %u\n", (unsigned int)h->sieve_threshold);
    1632           0 :   err_printf("MPQS: computing relations:");
    1633           0 : }
    1634             : 
    1635             : /* Factors N using the self-initializing multipolynomial quadratic sieve
    1636             :  * (SIMPQS).  Returns one of the two factors, or (usually) a vector of factors
    1637             :  * and exponents and information about which ones are still composite, or NULL
    1638             :  * when we can't seem to make any headway. */
    1639             : GEN
    1640        3286 : mpqs(GEN N)
    1641             : {
    1642        3286 :   const long size_N = decimal_len(N);
    1643             :   mpqs_handle_t H;
    1644             :   GEN fact; /* will in the end hold our factor(s) */
    1645             :   mpqs_FB_entry_t *FB; /* factor base */
    1646             :   double dbg_target, DEFEAT;
    1647             :   ulong p;
    1648             :   pari_timer T;
    1649             :   hashtable lprel, frel;
    1650        3286 :   pari_sp av = avma;
    1651             : 
    1652        3286 :   if (DEBUGLEVEL >= 4) err_printf("MPQS: number to factor N = %Ps\n", N);
    1653        3286 :   if (size_N > MPQS_MAX_DIGIT_SIZE_KN) { toolarge(); return NULL; }
    1654        3279 :   if (DEBUGLEVEL >= 4)
    1655             :   {
    1656           0 :     timer_start(&T);
    1657           0 :     err_printf("MPQS: factoring number of %ld decimal digits\n", size_N);
    1658             :   }
    1659        3279 :   H.N = N;
    1660        3279 :   H.two_is_norm = 0; /* Not used */
    1661        3279 :   H.two_is_bad = 0;
    1662        3279 :   H.bin_index = 0;
    1663        3279 :   H.index_i = 0;
    1664        3279 :   H.index_j = 0;
    1665        3279 :   H.index2_moved = 0;
    1666        3279 :   p = mpqs_find_k(&H);
    1667        3279 :   if (p) return gc_utoipos(av,p);
    1668        3279 :   if (DEBUGLEVEL >= 5)
    1669           0 :     err_printf("MPQS: found multiplier %ld for N\n", H._k->k);
    1670        3279 :   H.kN = muliu(N, H._k->k);
    1671        3279 :   if (!mpqs_set_parameters(&H)) { toolarge(); return NULL; }
    1672             : 
    1673        3279 :   if (DEBUGLEVEL >= 5)
    1674           0 :     err_printf("MPQS: creating factor base and allocating arrays...\n");
    1675        3279 :   FB = mpqs_create_FB(&H, &p);
    1676        3279 :   if (p) return gc_utoipos(av, p);
    1677        3279 :   mpqs_sieve_array_ctor(&H);
    1678        3279 :   mpqs_poly_ctor(&H);
    1679             : 
    1680        3279 :   H.lp_bound = minss(H.largest_FB_p, MPQS_LP_BOUND);
    1681             :   /* don't allow large primes to have room for two factors both bigger than
    1682             :    * what the FB contains (...yet!) */
    1683        3279 :   H.lp_bound *= minss(H.lp_scale, H.largest_FB_p - 1);
    1684        3279 :   H.dkN = gtodouble(H.kN);
    1685             :   /* compute the threshold and fill in the byte-sized scaled logarithms */
    1686        3279 :   mpqs_set_sieve_threshold(&H);
    1687        3279 :   if (!mpqs_locate_A_range(&H)) return NULL;
    1688        3279 :   if (DEBUGLEVEL >= 4)
    1689           0 :     mpqs_status(&H, FB);
    1690             :   /* main loop which
    1691             :    * - computes polynomials and their zeros (SI)
    1692             :    * - does the sieving
    1693             :    * - tests candidates of the sieve array */
    1694             : 
    1695             :   /* Let (A, B_i) the current pair of coeffs. If i == 0 a new A is generated */
    1696        3279 :   H.index_j = (mpqs_uint32_t)-1;  /* increment below will have it start at 0 */
    1697             : 
    1698        3279 :   dbg_target = H.target_rels / 100.;
    1699        3279 :   DEFEAT = H.target_rels * 1.5;
    1700        3279 :   hash_init_GEN(&frel, H.target_rels, gequal, 1);
    1701        3279 :   hash_init_ulong(&lprel,H.target_rels, 1);
    1702             :   for(;;)
    1703      138793 :   {
    1704             :     long tc;
    1705             :     /* self initialization: compute polynomial and its zeros */
    1706      142072 :     if (!mpqs_self_init(&H, NULL))
    1707             :     { /* have run out of primes for A; give up */
    1708           0 :       if (DEBUGLEVEL >= 2)
    1709           0 :         err_printf("MPQS: Ran out of primes for A, giving up.\n");
    1710           0 :       return gc_NULL(av);
    1711             :     }
    1712      142072 :     mpqs_sieve(&H);
    1713      142072 :     tc = mpqs_eval_sieve(&H);
    1714      142072 :     if (DEBUGLEVEL >= 6)
    1715           0 :       err_printf("MPQS: found %lu candidate%s\n", tc, (tc==1? "" : "s"));
    1716      142072 :     if (tc)
    1717             :     {
    1718      132447 :       fact = mpqs_eval_cand(&H, tc, &frel, &lprel, MPQS_MODE_FACTOR);
    1719      132447 :       if (fact)
    1720             :       { /* factor found during combining */
    1721           0 :         if (DEBUGLEVEL >= 4)
    1722             :         {
    1723           0 :           err_printf("\nMPQS: split N whilst combining, time = %ld ms\n",
    1724             :                      timer_delay(&T));
    1725           0 :           err_printf("MPQS: found factor = %Ps\n", fact);
    1726             :         }
    1727           0 :         return gc_upto(av, fact);
    1728             :       }
    1729             :     }
    1730      142072 :     if (DEBUGLEVEL >= 4 && frel.nb > dbg_target)
    1731             :     {
    1732           0 :       err_printf(" %ld%%", 100*frel.nb/ H.target_rels);
    1733           0 :       if (DEBUGLEVEL >= 5) err_printf(" (%ld ms)", timer_delay(&T));
    1734           0 :       dbg_target += H.target_rels / 100.;
    1735             :     }
    1736      142072 :     if (frel.nb < (ulong)H.target_rels) continue; /* main loop */
    1737             : 
    1738        3376 :     if (DEBUGLEVEL >= 4)
    1739             :     {
    1740           0 :       timer_start(&T);
    1741           0 :       err_printf("\nMPQS: starting Gauss over F_2 on %ld distinct relations\n",
    1742             :                  frel.nb);
    1743             :     }
    1744        3376 :     fact = mpqs_solve_linear_system(&H, &frel);
    1745        3376 :     if (fact)
    1746             :     { /* solution found */
    1747        3279 :       if (DEBUGLEVEL >= 4)
    1748             :       {
    1749           0 :         err_printf("\nMPQS: time in Gauss and gcds = %ld ms\n",timer_delay(&T));
    1750           0 :         if (typ(fact) == t_INT) err_printf("MPQS: found factor = %Ps\n", fact);
    1751             :         else
    1752             :         {
    1753           0 :           long j, nf = (lg(fact)-1)/3;
    1754           0 :           err_printf("MPQS: found %ld factors =\n", nf);
    1755           0 :           for (j = 1; j <= nf; j++)
    1756           0 :             err_printf("\t%Ps%s\n", gel(fact,3*j-2), (j < nf)? ",": "");
    1757             :         }
    1758             :       }
    1759        3279 :       return gc_upto(av, fact);
    1760             :     }
    1761          97 :     if (DEBUGLEVEL >= 4)
    1762             :     {
    1763           0 :       err_printf("\nMPQS: time in Gauss and gcds = %ld ms\n",timer_delay(&T));
    1764           0 :       err_printf("MPQS: no factors found.\n");
    1765           0 :       if (frel.nb < DEFEAT)
    1766           0 :         err_printf("\nMPQS: restarting sieving ...\n");
    1767             :       else
    1768           0 :         err_printf("\nMPQS: giving up.\n");
    1769             :     }
    1770          97 :     if (frel.nb >= DEFEAT) return gc_NULL(av);
    1771          97 :     H.target_rels += 10;
    1772             :   }
    1773             : }
    1774             : 
    1775             : int
    1776          28 : mpqs_class_init(mpqs_handle_t *H, GEN D, long L)
    1777             : {
    1778             :   mpqs_FB_entry_t *FB; /* factor base */
    1779          28 :   ulong d = Mod16(D);
    1780          28 :   H->N = D;
    1781          28 :   H->two_is_norm = !(d==0 || d==4 || d==5 || d==13);
    1782          28 :   H->two_is_bad = d==0 || d==4;
    1783          28 :   H->bin_index = 0;
    1784          28 :   H->index_i = 0;
    1785          28 :   H->index_j = 0;
    1786          28 :   H->index2_moved = 0;
    1787          28 :   H->_k = &cand_multipliers[0];
    1788          28 :   H->kN = D;
    1789          28 :   if (!mpqs_set_parameters(H)) { toolarge(); return 0; }
    1790          28 :   H->size_of_FB = L + !H->two_is_norm;
    1791          28 :   if (DEBUGLEVEL >= 5)
    1792           0 :     err_printf("MPQS: creating factor base and allocating arrays...\n");
    1793          28 :   FB = mpqs_create_FB(H, NULL);
    1794          28 :   mpqs_sieve_array_ctor(H);
    1795          28 :   mpqs_poly_ctor(H);
    1796             : 
    1797             : #ifdef CLASSGROUP_LARGE_PRIME
    1798             :   H->lp_bound = minss(H->largest_FB_p, MPQS_LP_BOUND);
    1799             :   /* don't allow large primes to have room for two factors both bigger than
    1800             :    * what the FB contains (...yet!) */
    1801             :   H->lp_bound *= minss(H->lp_scale, H->largest_FB_p - 1);
    1802             : #else
    1803          28 :   H->lp_bound = H->largest_FB_p;
    1804             : #endif
    1805             : 
    1806          28 :   H->dkN = gtodouble(absi(H->kN));
    1807             :   /* compute the threshold and fill in the byte-sized scaled logarithms */
    1808          28 :   mpqs_set_sieve_threshold(H);
    1809          28 :   if (!mpqs_locate_A_range(H)) return 0;
    1810          28 :   if (DEBUGLEVEL >= 4)
    1811           0 :     mpqs_status(H, FB);
    1812          28 :   return 1;
    1813             : }
    1814             : 
    1815             : GEN
    1816         336 : mpqs_class_rels(mpqs_handle_t *H, ulong nb, GEN missing_primes)
    1817             : {
    1818         336 :   pari_sp av = avma;
    1819             :   pari_timer T;
    1820             :   double dbg_target;
    1821             :   hashtable lprel, frel;
    1822         336 :   H->index_j = (mpqs_uint32_t)-1;  /* increment below will have it start at 0 */
    1823         336 :   H->target_rels = nb;
    1824         336 :   if (DEBUGLEVEL >= 4)
    1825             :   {
    1826           0 :     err_printf("MPQS: striving for %ld relations\n", (long)H->target_rels);
    1827           0 :     err_printf("MPQS: computing relations:");
    1828             :   }
    1829         336 :   dbg_target = H->target_rels / 100.;
    1830         336 :   hash_init_GEN(&frel, H->target_rels, gequal, 1);
    1831         336 :   hash_init_ulong(&lprel,H->target_rels, 1);
    1832         336 :   if (DEBUGLEVEL >= 5)
    1833           0 :     timer_start(&T);
    1834         336 :   if (missing_primes && lg(missing_primes)==1)
    1835         119 :     missing_primes = NULL;
    1836         336 :   if (DEBUGLEVEL >= 5 && missing_primes)
    1837           0 :       err_printf("MPQS: %ld remaining primes\n",lg(missing_primes)-1);
    1838             :   for (;;)
    1839         847 :   {
    1840        1183 :     long tc, err = mpqs_self_init(H, missing_primes);
    1841        1183 :     if (err==0)
    1842             :     { /* have run out of primes for A; give up */
    1843           0 :       if (DEBUGLEVEL >= 2)
    1844           0 :         err_printf("MPQS: Ran out of primes for A, giving up.\n");
    1845           0 :       return NULL;
    1846        1183 :     } else if (err==-1)
    1847           0 :       return NULL;
    1848        1183 :     mpqs_sieve(H);
    1849        1183 :     tc = mpqs_eval_sieve(H);
    1850        1183 :     if (DEBUGLEVEL >= 6)
    1851           0 :       err_printf("MPQS: found %lu candidate%s\n", tc, (tc==1? "" : "s"));
    1852        1183 :     if (tc)
    1853        1183 :       mpqs_eval_cand(H, tc, &frel, &lprel, MPQS_MODE_CLASSGROUP);
    1854        1183 :     if (DEBUGLEVEL >= 4 && frel.nb > dbg_target)
    1855             :     {
    1856           0 :       err_printf(" %ld%%", 100*frel.nb/ H->target_rels);
    1857           0 :       if (DEBUGLEVEL >= 5) err_printf(" (%ld ms)", timer_delay(&T));
    1858           0 :       dbg_target += H->target_rels / 100.;
    1859             :     }
    1860        1183 :     if (frel.nb < nb) continue; /* main loop */
    1861         336 :     break;
    1862             :   }
    1863         336 :   if (DEBUGLEVEL >= 4) err_printf("\n");
    1864         336 :   return gc_GEN(av, rels_to_quad(H, hash_keys_GEN(&frel)));
    1865             : }

Generated by: LCOV version 1.16