Peter Bruin on Thu, 07 May 2015 17:07:12 +0200

[Date Prev] [Date Next] [Thread Prev] [Thread Next] [Date Index] [Thread Index]

Re: Strassen multiplication over the integers

Hello Bill,

>> >> > - Most of the code is fairly generic, so maybe there could be a
>> >> > gen_matmul_sw function.
>> I have now implemented this; see the attached patch.  It is
>> completely analogous to ZM_mul_sw.  There are some new tests to
>> ensure that the new code is covered.
> Did you test whether ZM_mul_sw was faster than gen_matmul_sw (with the
> same tuning) ?

I haven't compared the two functions over Z; I would personally prefer a
specialised ZM_mul that is as fast as possible, rather than implementing
it via gen_matmul_sw.  I do see that this is not very desirable from the
perspective of code duplication, though.

> One would need to write a bb_field instance for Z (the inv field is
> not used).

In fact, the operations that one needs for Strassen's algorithm don't
correspond directly to the set of operations defined by a bb_field: one
needs addition, subtraction, negation, zero element, and basecase matrix
multiplication (either subtraction or negation might be implemented via
the other, given addition and zero element).

> It is unfortunate that gen_matmul() takes a bb_field because it is
> also valid over rings. I am not sure what is the best way to handle
> that.

Maybe the bb_field and bb_ring structures could eventually be merged,
allowing the inv member to raise an error when it is applied to a
non-unit element?

Actually, I'm still unsure for what kind of applications we would really
want to have gen_matmul_sw.  At least in the cases where bb_field is
currently used (non-prime finite fields), Kronecker representation
combined with Strassen multiplication over Z is certainly much faster
than Strassen over the original field.  I would expect it to be up to
about 4 times as fast for matrices of size around 100, based on the
speedup it gives in my project and on the improvements I got while
implementing Kronecker substitution for classical matrix multiplication.
Also, ZM_mul_sw is definitely easier to tune.