Gonzalo Tornaria on Tue, 20 May 2003 11:21:26 -0500

 Sparse matrix multiplication // charpoly

```Idea:

If we have a "sparse" matrix, and we are going to do a lot of
multiplications on the left (right), we "study" it before, and replace
it by something else that can be used to multiply matrices on the left
(right). E.g.:

-------------------------------------------
? sl = matrix_sparseleft(m);
time = 3,810 ms.
? sparseleft_mul(sl, m);
time = 2,990 ms.
? m * m == %
time = 11,740 ms.
%22 = 1
-------------------------------------------

This is typical for 1000x1000 sparse matrices: it takes less time to
compute this way, even including the "study" time.

The "study" to prepare multiplication by the left consist in replacing
each row by a vector of pairs [col, entry], and multiplication by
another matrix amounts to take linear combinations of the rows according
to this vectors of pairs.

In my naive GP implementation (far from optimal), it seems to works
faster with multiplication by the right than the left... (which makes
sense, since matrices are organized by columns). That's also true for
the PARI multiplication.

To compute characteristic polynomials, we "study" the matrix once,
and then we use the studied one to multiply on the right.
This gives good results for big matrices.

I'll add the programs in next message.

----------------------------------------------

In what follows, a "SPARSE" n by n matrix  has about 4*n non-zero entries
(random integers from 1 to 10000), whereas a "DENSE" matrix has all
entries non-zero.  A "RIGHT" is a sparse matrix studied for
multiplication on the right, and a "LEFT" is a sparse matrix studied for
multiplication on the left.

CP(DENSE) is computing the charpoly for a dense matrix,
CP(SPARSE) is computing the charpoly for a sparse matrix;
for the GP program, it studies the matrix for the right once, and
then it takes advantage of the fast multiplication.

CP(SPARSE1) is charpoly for a sparse matrix whose entries are
non-negative integers that add up to 4 by columns (like the examples of the
previous message). The "generic" GP program runs almost as fast as the
"special purpose" GP program (for 4-regular graphs).

There isn't appreciable difference between multiplication by SPARSE and
SPARSE1. Why is there difference for computing the charpoly? I think is
because the coefficients of the intermediate matrices we compute
explode, and that's of course much much worse when the coefficients are
up to 10000 than when they are really small. In fact, I couldn't compute
the CP(SPARSE) for
500x500 (it runs out of memory), but I can compute CP(SPARSE1).

The timings are with the patched PARI (P4 2.0GHz)..

----------------------------------
100x100
---------

PARI: SPARSE * SPARSE :  17.3 msec
PARI: DENSE  * SPARSE :  20.3 msec
PARI: SPARSE * DENSE  :  94.0 msec
PARI: DENSE  * DENSE  : 160.0 msec

GP:   Study for right :  38.7 msec
GP:   SPARSE * RIGHT  :  18.4 msec
GP:   DENSE  * RIGHT  :  23.2 msec

GP:   Study for left  :  38.7 msec
GP:   LEFT   * SPARSE :  19.5 msec
GP:   LEFT   * DENSE  :  26.3 msec

PARI: CP(DENSE)       :  80.64 sec
PARI: CP(SPARSE)      :   3.69 sec
GP:   CP(SPARSE)      :   3.13 sec

PARI: CP(SPARSE1)     :   1.99 sec
GP:   CP(SPARSE1)     :   1.20 sec
----------------------------------
200x200
---------

PARI: SPARSE * SPARSE :  119.8 msec
PARI: DENSE  * SPARSE :  130.2 msec
PARI: SPARSE * DENSE  :  854.0 msec
PARI: DENSE  * DENSE  : 1530.0 msec

GP:   Study for right :  150.2 msec
GP:   SPARSE * RIGHT  :   70.0 msec
GP:   DENSE  * RIGHT  :   89.8 msec

GP:   Study for left  :  152.0 msec
GP:   LEFT   * SPARSE :   85.4 msec
GP:   LEFT   * DENSE  :  120.4 msec

PARI: CP(SPARSE)      :  57.50 sec
GP:   CP(SPARSE)      :  43.79 sec

PARI: CP(SPARSE1)     :  26.49 sec
GP:   CP(SPARSE1)     :   9.61 sec
----------------------------------
500x500
---------

PARI: SPARSE * SPARSE : 1773.0 msec
PARI: DENSE  * SPARSE : 1679.0 msec
PARI: SPARSE * DENSE  :16110.0 msec
PARI: DENSE  * DENSE  :26880.0 msec

GP:   Study for right :  919.0 msec
GP:   SPARSE * RIGHT  :  427.0 msec
GP:   DENSE  * RIGHT  :  547.0 msec

GP:   Study for left  :  943.0 msec
GP:   LEFT   * SPARSE :  669.0 msec
GP:   LEFT   * DENSE  :  892.0 msec

PARI: CP(SPARSE1)     : 887.21 sec
GP:   CP(SPARSE1)     : 175.12 sec
----------------------------------
1000x1000
---------

PARI: SPARSE * SPARSE :  12.1 sec
PARI: DENSE  * SPARSE :  12.7 sec
PARI: SPARSE * DENSE  : 162.8 sec
PARI: DENSE  * DENSE  : 251.1 sec

GP:   Study for right :   3.6 sec
GP:   SPARSE * RIGHT  :   1.6 sec
GP:   DENSE  * RIGHT  :   2.2 sec

GP:   Study for left  :   3.6 sec
GP:   LEFT   * SPARSE :   3.0 sec
GP:   LEFT   * DENSE  :   3.9 sec

PARI: CP(SPARSE1)     :????.?? sec
GP:   CP(SPARSE1)     :2021.34 sec
----------------------------------

NOTES:

0) Take all the number with a grain of salt, because this is not a
"serious" statistic, just sample timings.

0') The GP program is certainly not optimal, and I have not even
compiled it. A hand-made implementation in C should be even better.

0'') The PARI multiplication seems "to learn" (!), in fact, the first
time one multiplies takes longer. E.g. for 500x500 my data shows that
SPARSE*SPARSE takes more time than DENSE*SPARSE, but that seems to be
because I use the same matrix, 10 times for each, but the first time it
takes twice as much (!) So, put more salt on my numbers :-)

1) The current PARI implementation of multiplication takes good
advantage of sparseness if the *second* matrix is sparse (independent of
the first one), but not nearly as much if only the *first* one is
sparse. I guess that's unavoidable, unless one tests the first matrix
for sparseness and uses a the "other" algorithm (namely, m1*m2=(m2~*m1~)~)

2) For multiplication of 100x100 matrices, PARI is slightly better, even
without counting the "study time"; but GP is slightly better for
charpoly !!

3) For multiplication of 200x200 matrices, GP is better than PARI not
counting the study time. For charpoly, GP is better, but for SPARSE1
matrices, is even better (3x).

4) For multiplication of 500x500 matrices, GP is sligthly better
counting the study time (and 4x not counting it) !! CP is 5x better for SPARSE1 matrices.

5) For multiplication of 1000x1000 matrices, GP is 2x better than PARI
counting the study time, and 8x without study time.

Gonzalo
--
GM/CS/S d? a-- C++(+++)  UL+++(++++)  P++>+++  L+++>++++ E---
W-(+) N+(++) w--- O M-(--) V-(--) PGP(--) b++(+++) G++ e>++++

... A mathematician is a machine for converting coffee into theorems.
```