Reputation: 59
Alice has N coins of amount from 0 to (N-1). Bob wants to take k coins out of them, but Alice will only give if the set of K coins is interesting.
A set of coins is interesting if the sum of them is divisible by a unique integer M. Now Bob wants to know in how many ways he can get K coins.
Print the result by answer%(10^9+7)
Input format:- Three space separated integers N,K,M
Constraints:-
- 1 <= N, M <= 10 ^ 3
- 1 <= K <= 10 ^ 2
Sample input:- 4 2 2
Sample output:- 2({1,3},{2,4})
I tried solving the problem using combinations in python libraries but it resulted the Memory limit exceeded. Later I used the recursive method to it but it also resulted the Time limit exceeded. as it took 10 sec time for each private test cases.
Can anyone help in solving this effective manner?
Code of the recursion method:
cou=0
n,k,m = input().split()
out_ = solve(k,m,n)
print(out_)
def getCombinations(data,n,k,m):
val=[0]*k
combiUtil(data,n,k,0,val,0,m)
def combiUtil(data,n,k,ind,val,i,m):
global cou
if(ind==k):
if(sum(val)%m==0):
cou+=1
return
if(i>=n):
return
val[ind]=data[i]
combiUtil(data,n,k,ind+1,val,i+1,m)
combiUtil(data,n,k,ind,val,i+1,m)
def solve(k,m,n):
global cou
k=int(k)
m=int(m)
n=int(n)
ans =0
data = [j for j in range(1,n+1)]
getCombinations(data,n,k,m)
return cou%(10**9+7)
Upvotes: 5
Views: 3167
Reputation: 10153
Three NumPy versions of @kcsquared's solution that easily solve the worst case in well under the 10 seconds time limit:
def numpy1(n, k, m):
dp = np.zeros((k+1, m), np.int32)
dp[0][0] = 1
for i in range(1, n+1):
dp[1:,] += dp[:-1, (np.arange(m) + i) % m]
dp %= 10**9 + 7
return dp[k][0]
def numpy2(n, k, m):
dp = np.zeros((k+1, m), np.int32)
dp[0][0] = 1
i = range(m)
for _ in range(n):
i = np.roll(i, 1)
dp[1:,] += dp[:-1, i]
dp %= 10**9 + 7
return dp[k][0]
def numpy3(n, k, m):
dp = np.zeros((k+1, m), np.int32)
dp[0][0] = 1
for i in range(n):
dp[1:,] += np.roll(dp[:-1,], i, axis=1)
dp %= 10**9 + 7
return dp[k][0]
Benchmarks for small, medium and worst case:
n = 19 k = 11 m = 13
-------------------------------------------------------
22856.8 μs 23409.3 μs 23421.4 μs naive
496.9 μs 500.2 μs 524.7 μs dtjc
918.6 μs 928.6 μs 936.3 μs kcsquared
173.5 μs 183.6 μs 191.9 μs numpy1
402.2 μs 403.5 μs 411.1 μs numpy2
297.8 μs 318.4 μs 320.1 μs numpy3
n = 200 k = 100 m = 200
-------------------------------------------------------
2033.6 ms 2177.3 ms 2178.1 ms dtjc
1410.6 ms 1420.2 ms 1430.5 ms kcsquared
19.5 ms 19.8 ms 20.3 ms numpy1
22.5 ms 22.9 ms 23.0 ms numpy2
26.8 ms 27.3 ms 27.3 ms numpy3
n = 1000 k = 100 m = 1000
-------------------------------------------------------
508.0 ms 516.1 ms 519.2 ms numpy1
518.3 ms 518.8 ms 526.3 ms numpy2
495.1 ms 496.4 ms 499.2 ms numpy3
Benchmark code (Try it online!):
from timeit import repeat
from itertools import combinations
from functools import lru_cache
import numpy as np
def naive(n, k, m):
return sum(sum(combi) % m == 0
for combi in combinations(range(1, n+1), k)) % (10**9 + 7)
@lru_cache(None)
def dtjc(n, k, m, r=0):
if k > n:
return 0
if k == 0:
return 1 if r == 0 else 0
return (dtjc(n-1, k, m, r) + dtjc(n-1, k-1, m, (r+n) % m)) % (10**9 + 7)
def kcsquared(max_part_size: int, total_num_parts: int, mod: int) -> int:
BIG_MOD = 10 ** 9 + 7
# Optimization if no partitions exist
if total_num_parts > max_part_size:
return 0
largest_sum = ((max_part_size * (max_part_size + 1) // 2)
- ((max_part_size - total_num_parts) *
(max_part_size - total_num_parts + 1) // 2))
# Optimization if largest partition sum still smaller than mod
if largest_sum < mod:
return 0
dp = [[0 for _ in range(mod)] for _ in range(total_num_parts + 1)]
dp[0][0] = 1
for curr_max_part in range(1, max_part_size + 1):
for curr_num_parts in reversed(range(0, total_num_parts)):
for rem in range(mod):
dp[curr_num_parts + 1][(rem + curr_max_part) % mod] += dp[curr_num_parts][rem]
dp[curr_num_parts + 1][(rem + curr_max_part) % mod] %= BIG_MOD
return dp[total_num_parts][0]
def numpy1(n, k, m):
dp = np.zeros((k+1, m), np.int32)
dp[0][0] = 1
for i in range(1, n+1):
dp[1:,] += dp[:-1, (np.arange(m) + i) % m]
dp %= 10**9 + 7
return dp[k][0]
def numpy2(n, k, m):
dp = np.zeros((k+1, m), np.int32)
dp[0][0] = 1
i = range(m)
for _ in range(n):
i = np.roll(i, 1)
dp[1:,] += dp[:-1, i]
dp %= 10**9 + 7
return dp[k][0]
def numpy3(n, k, m):
dp = np.zeros((k+1, m), np.int32)
dp[0][0] = 1
for i in range(n):
dp[1:,] += np.roll(dp[:-1,], i, axis=1)
dp %= 10**9 + 7
return dp[k][0]
def test(args, solutions, number, format_time):
print('n = %d k = %d m = %d' % args)
print('-' * 55)
for _ in range(1):
results = set()
for func in solutions:
times = sorted(repeat(lambda: dtjc.cache_clear() or results.add(func(*args)), number=number))[:3]
print(*(format_time(t / number) for t in times), func.__name__)
print('results set:', results)
assert len(results) == 1
print()
test((19, 11, 13),
[naive, dtjc, kcsquared, numpy1, numpy2, numpy3],
10,
lambda time: '%7.1f μs ' % (time * 1e6))
test((200, 100, 200),
[dtjc, kcsquared, numpy1, numpy2, numpy3],
1,
lambda time: '%6.1f ms ' % (time * 1e3))
test((1000, 100, 1000),
[numpy1, numpy2, numpy3],
1,
lambda time: '%5.1f ms ' % (time * 1e3))
Upvotes: 1
Reputation: 5409
If you look at the time complexity for the 'try all combinations' brute force solution, it's equal to O((N choose K) * K) = O(K * N^K
), since there are N choose K
ways to pick K
distinct integers from 1 to N
inclusive, and evaluating their sum takes K-1
additions. For all but tiny values of N
and K
, this is astronomically large.
Better Solution: Dynamic Programming
A much faster and slightly simpler solution is dynamic programming. We can write this as a 3D dynamic programming problem:
Let dp[i][j][r], 0 <= i <= N; 0 <= j <= K; 0 <= r < M
be the number of combinations of j ints from [1, 2, ..., i]
with sum congruent to r modulo M. We want dp[N][K][0]
dp[i][j][r] = 1 if i == j == r == 0
0 if i == 0 and (j /= 0 or r /= 0)
1 if j == 0 and r == 0
0 if j == 0 and r /= 0
dp[i-1][j][r] + dp[i-1][j-1][(r-i) % M] otherwise
There's a lot of edge cases added into the formula, but the most important aspect is the last case: Our Dynamic Programming subproblem depends on at most 2 other subproblems, so the total runtime is the size of our DP array: O(nmk)
. Here's a Python implementation:
def get_combinations_dp(max_part_size: int, total_num_parts: int, mod: int) -> int:
BIG_MOD = 10 ** 9 + 7
# Optimization if no partitions exist
if total_num_parts > max_part_size:
return 0
largest_sum = ((max_part_size * (max_part_size + 1) // 2)
- ((max_part_size - total_num_parts) *
(max_part_size - total_num_parts + 1) // 2))
# Optimization if largest partition sum still smaller than mod
if largest_sum < mod:
return 0
dp = [[0 for _ in range(mod)] for _ in range(total_num_parts + 1)]
dp[0][0] = 1
for curr_max_part in range(1, max_part_size + 1):
for curr_num_parts in reversed(range(0, total_num_parts)):
for rem in range(mod):
dp[curr_num_parts + 1][(rem + curr_max_part) % mod] += dp[curr_num_parts][rem]
dp[curr_num_parts + 1][(rem + curr_max_part) % mod] %= BIG_MOD
return dp[total_num_parts][0]
The parameters are N, K, M
, renamed to max_part_size, total_num_parts, mod
, with some optional pre-checks to return 0
immediately if there's no partition.
Now, suppose you want to do better than O(nmk)
. Here, things get tricky. If you want to do strictly better, the only way I can imagine is to find the generating function for these partitions, and use FFT or some other fast polynomial multiplication modulo 10**9 + 7
. For a start on researching how to do this, I'd recommend this thread on the math stackexchange, which pertains to counting precisely these partitions in terms of more well known partition numbers, whose generating functions are already known. Even so, I was unable to find any mention of whether this generating function has a short representation, and using partition numbers directly can't improve on the O(nmk)
complexity.
Using combinatorics
If you still want to use this dynamic programming approach, there's a small modification using combinatorics which may be asymptotically faster when N
is larger than M*K
: it runs in time O((M*K)^2)
, which doesn't depend on N
. The idea is to use our DP formula, but instead of choosing K distinct integers from [1, ... N]
, we now choose K (possibly repeated) residue classes from [0, ... M-1]
.
How does this work? First, we need to count how many ints in [1, ... N]
fall into each residue class i mod M
. Call this number R[i]
, for 0 <= i < M
. You can calculate this as
R[i] = floor(N/M) + (1 if 0 < i <= N%M else 0)
Now we can write a slightly modified Dynamic Programming definition and formula:
Let dp[i][j][r], 0 <= i <= M; 0 <= j <= K; 0 <= r < M
be the number of combinations with replacement of j ints from
residue classes [0, 1, ... i-1] with sum congruent to r modulo M.
We want dp[M][K][0]:
dp[i][j][r] = 1 if i == j == r == 0
0 if i == 0 and (j /= 0 or r /= 0)
0 if i < 0 or j < 0
F(i, j, r) otherwise
F(i, j, r) = Sum from p = 0 to min(R[i], j) of:
(R[i] choose p) * dp[i-1][j-p][(r - i*p) % M]
Upvotes: 6
Reputation: 520
I hope you might have already solved the problem. But still, I'm answering this question for people who find it helpful.
You have tried to get the combinations on your own, but you could've used a library instead to get all the possible combinations and simply iterate through and check for the condition. If the purpose is not only for learning, it's always ok to use already available code.
Anyways, have a look at the code. Thanks.
from itertools import combinations
def getCombi(n, k, m):
count = 0
#Required Combinations
reqcombis = []
array = [i for i in range(1, n+1)]
#getting all possible combinations
totalcombins = combinations(array, k)
for i in totalcombins:
if sum(i) % m == 0 and sum(i) <= n:
count+=1
reqcombis.append(i)
return count, reqcombis
if __name__ == "__main__":
n, k, m = input().split(",")
n, k, m = int(n), int(k), int(m)
print(getCombi(n, k, m))
Upvotes: 0