CodeWalrus

General => Other => Topic started by: Zeda on June 26, 2018, 06:36:55 pm

Title: Topic: Fast 3-digit Squaring, Toom-Cook
Post by: Zeda on June 26, 2018, 06:36:55 pm
Hi all, I realized that I have a lot of unshared results that I've worked out over the years! (Then again, I have a lot of shared results :P )
Today's is dealing with squaring 3-digit numbers.
If you aren't familiar with the Toom-Cook algorithm, read the spoiler! It's cool stuff. Otherwise, continue down.
(click to show/hide)
To square a 3-digit number, we'll result in a degree-4 polynomial. Consequently, we'll need to evaluate at 5 points. The points I chose were {0,1,i,-1,-i} where 'i' is the imaginary unit. It may seem like evaluating at a complex point would be, well, complex, but we'll see. First, let's write our polynomial as a0+a1x+a2x2.
Then:
b0=f(0)=a02
b1=f(1)=(a0+a1+a2)2
b2=f(i)=(a0+a1i-a2)2
b3=f(-1)=(a0-a1+a2)2
b4=f(-i)=(a0-ia1-a2)2
Squaring complex numbers is easy though-- (a+bi)2= a2-b2+2abi = (a-b)(a+b)+2abi
So simplifying:
b2=(a0-a2+a1)(a0-a2-a1)+2i(a0-a2)a1
b4=(a0-a2+a1)(a0-a2-a1)-2i(a0-a2)a1

Noticing symmetry, let's define instead:
c3=(a0-a2+a1)(a0-a2-a1)
c4=2(a0-a2)a1

Then:
b2=c3+ic4
b4=c3-ic4


    1       0       0       0       0   b0
    1       1       1       1       1   b1
    1       i      -1      -i       1   c3+ic4
    1      -1       1      -1       1   b3
    1      -i      -1       i       1   c3-ic4


    1       0       0       0       0   b0
    1       0       1       0       1   (b1+b3)/2 = c1
=>  0       1       0       1       0   (b1-b3)/2 = c2
    1       0      -1       0       1   c3
    0       1       0      -1       0   c4


    1       0       0       0       0   b0
    1       0       0       0       1   (c1+c3)/2 = d1
=>  0       0       1       0       0   (c1-c3)/2 = d2
    0       1       0       0       0   (c2+c4)/2 = d3
    0       0       0       1       0   (c2-c4)/2 = d4


Then our polynomial is:
b0+d3x+d2x2+d4x3+(d1-b0)x4


So for an example, let's find 348*348
Then we have:
a0=8
a1=4
a2=3
b0=64
b1=(8+4+3)2=225
b3=(8-4+3)2=49
c3=(8-3+4)(8-3-4)=9
c4=2(8-3)*4=40
c1=(225+49)/2 = 137
c2=c1-49 = 88
d1=(137+9)/2 = 73
d2=d1-9 = 64
d3=(88+40)/2 = 64
d4=d3-40 = 24

So then, our result is:
64+64x+64x2+24x3+9x4
Evaluating at x=10 eyelids yields:
64+640+6400+24000+90000 = 121104, which is indeed 384*384

Now let's generate some Python code:
Code: [Select]
def sqr3(a,b,c):
    d=a+b+c    #roughly 75% of cases exceed the original bit-width
    d*=d
    e=a-b+c    #roughly 1/6 of the time this will exceed the original bit-width. However, whenever this exceeds the bit-width, then d also exceeds it.
    e*=e
    d=(d+e)/2
    e=d-e
    f=(a+b-c)*(a-b-c)   #it is never the case that both multiplicands exceed  their original word width, but about 1/3 of the time, one of them exceeds it (~2/3 of the time, neither exceed).
    b*=(a-c)        #neither multiplicand exceeds the original bit-width (though the second operand may become negative).
    b+=b
    a*=a
    e=(e+b)/2
    b=e-b
    d=(d-f)/2
    f+=d-a
    return [a,e,d,b,f]
Note that three of the sub-multiplications are just squaring the input. Also note that all divisions are by a power of two, and all results and operations are integer operations, not complex (assuming all inputs are integers).