Karim Belabas on Fri, 21 Nov 2014 01:04:39 +0100

 Re: Growing ordered set

```* Bill Allombert [2014-11-20 23:29]:
> On Thu, Nov 20, 2014 at 01:41:36PM -0500, Charles Greathouse wrote:
> > Fairly often I have a problem of the type 'generate a number by a given
> > process, and repeat until a duplicate is found'. I also have the related
> > problem 'generate a number by a given process, and repeat until some limit,
> > recording the number of distinct elements at each step'.
> >
> >
> > I can think of several ways, none satisfactory:
> > * concat a vector, search linearly. O(n) to add, O(n) to search, O(n)
> > overall.
> > * listput a list, search linearly. O(1) to add, O(n) to search, O(n)
> > overall.
> > * listput then listsort, setsearch. O(n log n) to add, O(log n) to search,
> > O(n log n) overall.
> > * setsearch(,,1) and listinsert. O(n) to add, O(log n) to search, O(n)
> > overall.
>
> I am not quite sure what complexity mode you choose for your O(), but
> the last one is not that bad, compared to the cost induced by the GP memory
> manager...
>
> Anyway, yes you can do better, using binary search tree!
> See the attached (rather crude) script.
>
> binary search tree:
>
> ? setrand(1);test(10^11)[1]
> %1 = 537777
> ? ##
>   ***   last result computed in 9,325 ms.
>
> listinsert:
>
> ? setrand(1);test2(10^11)[1]
> %2 = 537777
> ? ##
>   ***   last result computed in 43,584 ms.

That's beautiful, indeed, but...

(00:48) gp > install(test3,lG)
(00:48) gp > setrand(1); test3(10^11)
time = 220 ms.
%2 = 537777

long
test3(GEN N)
{
pari_sp av = avma;
hashtable *h = hash_create(600000,
(ulong (*)(void*))hash_GEN,
(int (*)(void*,void*))gidentical, 1);
long c = 0;

while(1)
{
void *k = (void*)randomi(N);
c++;
if (hash_search(h, k)) { avma = av; return c; }
hash_insert(h, k, NULL);
}
}

(N.B. my machine is a little slower than yours, I need about 11s for test() and
47s for test2.)

Cheers,

K.B.
--
Karim Belabas, IMB (UMR 5251)  Tel: (+33) (0)5 40 00 26 17
Universite de Bordeaux         Fax: (+33) (0)5 40 00 69 50
351, cours de la Liberation    http://www.math.u-bordeaux1.fr/~kbelabas/
F-33405 Talence (France)       http://pari.math.u-bordeaux1.fr/  [PARI/GP]
`

```