AFSK  Power Factor Sum Sum (hard)
Here is a mixed edition of Divisor Summation Powered and Amazing Factor Sequence (medium).
The powered factor sequence
For k an integer number, we define our powered factor sequence with:
a_{k}[0] = 0; a_{k}[1] = 1, and
for n > 1, a_{k}[n] = a_{k}[n  1] + sum({x^k  0 < x ≤ n and n % x = 0}).
Input
First line of input contains an integer T, the number of test cases.
Each of the next T lines contains three integers n, k, m.
Output
For each test case, print a_{k}[n] on a single line.
As the answer could be a big number, you just have to output it modulo m.
Example
Input: 3 3 1 10 4 2 55 5 3 97
Output: 8 37 43
Constraints
0 < T < 101 0 < n < 10^9 0 < k < 11 1 < m < 10^17
Numbers n, k, m are uniformrandomly chosen.
For your information, there's two input files, the first one is 'easy' with n≤100.
My (1kB)python code get AC around 0.96s. I have a much slower basic PIKE AC (4.8s).
(Edit 20170211 ; timings and TL updated after compiler changes)
hide comments
mike:
20170806 21:50:52
Last edit: 20170807 09:12:15 

varun bumb:
20150722 19:26:26
Last edit: 20190428 06:13:02 

[Lakshman]:
20141115 19:57:55
Finally Accepted !!!!!!!!! after such a long time.


Sandeep Pathry:
20130621 08:46:18
@francky... got AC...


fR0DDY:
20130619 17:30:12
Again need help in bringing it closer to at least 5 secs. It's 10 times slower than your best. :(


Thomas Dybdahl Ahle:
20130507 13:09:58
I don't get it. Isn't this exactly equal to calculating G(0,n,k) in IITD4?


maaz:
20130427 16:30:43
with the given constraints i must say a really good question.. 

[Lakshman]:
20130426 09:11:08
Why WA...It should be TLE but here I am getting WA...


Michael Kharitonov:
20130321 10:06:43
It's brilliant! When I publish empty comment, it says "Comment too long!"


[Rampage] Blue.Mary:
20130320 14:04:25
@Francky: Is my current C++ solution the expected one?

Added by:  Francky 
Date:  20130317 
Time limit:  15s 
Source limit:  50000B 
Memory limit:  1536MB 
Cluster:  Cube (Intel G860) 
Languages:  All except: ASM64 