MATRMUL - Matrix Multiplication 4K


Consider the following C++ code, it constructs n×n matrices A,B and vector V from matrix C=A×B which you need to calculate.

uint32_t n, i, j, d1, p1, r1, m1, d2, p2, m2, A[n][n], B[n][n];
uint64_t C[n][n], V[n];
//here you need to read n, p1, d1, r1, m1, p2, d2, r2, m2 from input.
for (i=0; i<n; ++i)
    for (j=0; j<n; ++j) {
        d1 = d1 * p1 + r1;
        d2 = d2 * p2 + r2;
        A[i][j] = d1 >> (32 - m1);
        B[i][j] = d2 >> (32 - m2);
    }
//here you need to calculate C=A*B
for (i=0; i<n; ++i) {
    V[i] = 0;
    for (j=0; j<n; ++j)
        V[i] ^= C[i][j];
}
//here you need to output V[0], V[1], ..., V[n-1], separated by spaces.

Input

You are given integers n, p1, d1, r1, m1, p2, d2, r2, m2, separated by spaces.

1≤n≤4096, 1≤m1,m2≤26, 0≤p1,d1,r1,p2,d2,r2<232.

Output

You need to output V[0], ..., V[n-1], separated by spaces.

Example

Input:
4 1664525 0 1013904223 26 1664525 1 1013904223 26
Output:
3929555766216722 770418013451752 7738542081270672 4286515685761206

P.S.

It is recommended to solve MATRMUL0 before this task, to get AC here your time there should be less than 14s for cubic algorithm and 16s for Strassen's.

There are 4 tests with n≈4096. Score is average time in seconds, lower is better. My best score — ~5.6s which is 5 times faster than time limit.


hide comments
Michael Kharitonov: 2017-12-31 12:17:52

tip: use Function Attributes or Function Specific Option Pragmas:
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html
https://gcc.gnu.org/onlinedocs/gcc/Function-Specific-Option-Pragmas.html

Last edit: 2018-07-25 22:34:23

Added by:Michael Kharitonov
Date:2017-04-08
Time limit:28s
Source limit:50000B
Memory limit:1536MB
Cluster: Cube (Intel G860)
Languages:All
Resource:fastcomputing.org