chandra sekhar
chandra sekhar

Reputation: 59

finding the count of number of sub arrays of size K whose sum is divisible by M?

Alice has N coins of amount from 0 to (N-1). Bob wants to take k coins out of them, but Alice will only give if the set of K coins is interesting.

A set of coins is interesting if the sum of them is divisible by a unique integer M. Now Bob wants to know in how many ways he can get K coins.

Print the result by answer%(10^9+7)

Input format:- Three space separated integers N,K,M

Constraints:-

  • 1 <= N, M <= 10 ^ 3
  • 1 <= K <= 10 ^ 2

Sample input:- 4 2 2

Sample output:- 2({1,3},{2,4})

I tried solving the problem using combinations in python libraries but it resulted the Memory limit exceeded. Later I used the recursive method to it but it also resulted the Time limit exceeded. as it took 10 sec time for each private test cases.

Can anyone help in solving this effective manner?

Code of the recursion method:

enter image description here

cou=0
n,k,m = input().split()
out_ = solve(k,m,n)
print(out_)


def getCombinations(data,n,k,m):
    val=[0]*k
    combiUtil(data,n,k,0,val,0,m)

def combiUtil(data,n,k,ind,val,i,m):
    global cou
    if(ind==k):
        if(sum(val)%m==0):
            cou+=1
        return
    if(i>=n):
        return
    val[ind]=data[i]
    combiUtil(data,n,k,ind+1,val,i+1,m)
    combiUtil(data,n,k,ind,val,i+1,m)

def solve(k,m,n):
    global cou
    k=int(k)
    m=int(m)
    n=int(n)
    ans =0
    data = [j for j in range(1,n+1)]
    getCombinations(data,n,k,m)   
    return cou%(10**9+7)

Upvotes: 5

Views: 3167

Answers (3)

no comment
no comment

Reputation: 10153

Three NumPy versions of @kcsquared's solution that easily solve the worst case in well under the 10 seconds time limit:

def numpy1(n, k, m):
    dp = np.zeros((k+1, m), np.int32)
    dp[0][0] = 1
    for i in range(1, n+1):
        dp[1:,] += dp[:-1, (np.arange(m) + i) % m]
        dp %= 10**9 + 7
    return dp[k][0]
def numpy2(n, k, m):
    dp = np.zeros((k+1, m), np.int32)
    dp[0][0] = 1
    i = range(m)
    for _ in range(n):
        i = np.roll(i, 1)
        dp[1:,] += dp[:-1, i]
        dp %= 10**9 + 7
    return dp[k][0]
def numpy3(n, k, m):
    dp = np.zeros((k+1, m), np.int32)
    dp[0][0] = 1
    for i in range(n):
        dp[1:,] += np.roll(dp[:-1,], i, axis=1)
        dp %= 10**9 + 7
    return dp[k][0]

Benchmarks for small, medium and worst case:

n = 19   k = 11   m = 13
-------------------------------------------------------
22856.8 μs  23409.3 μs  23421.4 μs  naive
  496.9 μs    500.2 μs    524.7 μs  dtjc
  918.6 μs    928.6 μs    936.3 μs  kcsquared
  173.5 μs    183.6 μs    191.9 μs  numpy1
  402.2 μs    403.5 μs    411.1 μs  numpy2
  297.8 μs    318.4 μs    320.1 μs  numpy3

n = 200   k = 100   m = 200
-------------------------------------------------------
2033.6 ms  2177.3 ms  2178.1 ms  dtjc
1410.6 ms  1420.2 ms  1430.5 ms  kcsquared
  19.5 ms    19.8 ms    20.3 ms  numpy1
  22.5 ms    22.9 ms    23.0 ms  numpy2
  26.8 ms    27.3 ms    27.3 ms  numpy3

n = 1000   k = 100   m = 1000
-------------------------------------------------------
508.0 ms  516.1 ms  519.2 ms  numpy1
518.3 ms  518.8 ms  526.3 ms  numpy2
495.1 ms  496.4 ms  499.2 ms  numpy3

Benchmark code (Try it online!):

from timeit import repeat
from itertools import combinations
from functools import lru_cache
import numpy as np

def naive(n, k, m):
    return sum(sum(combi) % m == 0
               for combi in combinations(range(1, n+1), k)) % (10**9 + 7)

@lru_cache(None)
def dtjc(n, k, m, r=0):
    if k > n:
        return 0
    if k == 0:
        return 1 if r == 0 else 0
    return (dtjc(n-1, k, m, r) + dtjc(n-1, k-1, m, (r+n) % m)) % (10**9 + 7)

def kcsquared(max_part_size: int, total_num_parts: int, mod: int) -> int:
    BIG_MOD = 10 ** 9 + 7

    # Optimization if no partitions exist
    if total_num_parts > max_part_size:
        return 0

    largest_sum = ((max_part_size * (max_part_size + 1) // 2)
                   - ((max_part_size - total_num_parts) *
                      (max_part_size - total_num_parts + 1) // 2))
    # Optimization if largest partition sum still smaller than mod
    if largest_sum < mod:
        return 0

    dp = [[0 for _ in range(mod)] for _ in range(total_num_parts + 1)]
    dp[0][0] = 1

    for curr_max_part in range(1, max_part_size + 1):
        for curr_num_parts in reversed(range(0, total_num_parts)):
            for rem in range(mod):
                dp[curr_num_parts + 1][(rem + curr_max_part) % mod] += dp[curr_num_parts][rem]
                dp[curr_num_parts + 1][(rem + curr_max_part) % mod] %= BIG_MOD
    return dp[total_num_parts][0]

def numpy1(n, k, m):
    dp = np.zeros((k+1, m), np.int32)
    dp[0][0] = 1
    for i in range(1, n+1):
        dp[1:,] += dp[:-1, (np.arange(m) + i) % m]
        dp %= 10**9 + 7
    return dp[k][0]

def numpy2(n, k, m):
    dp = np.zeros((k+1, m), np.int32)
    dp[0][0] = 1
    i = range(m)
    for _ in range(n):
        i = np.roll(i, 1)
        dp[1:,] += dp[:-1, i]
        dp %= 10**9 + 7
    return dp[k][0]

def numpy3(n, k, m):
    dp = np.zeros((k+1, m), np.int32)
    dp[0][0] = 1
    for i in range(n):
        dp[1:,] += np.roll(dp[:-1,], i, axis=1)
        dp %= 10**9 + 7
    return dp[k][0]

def test(args, solutions, number, format_time):
    print('n = %d   k = %d   m = %d' % args)
    print('-' * 55)
    for _ in range(1):
        results = set()
        for func in solutions:
            times = sorted(repeat(lambda: dtjc.cache_clear() or results.add(func(*args)), number=number))[:3]
            print(*(format_time(t / number) for t in times), func.__name__)
        print('results set:', results)
        assert len(results) == 1
        print()

test((19, 11, 13),
     [naive, dtjc, kcsquared, numpy1, numpy2, numpy3],
     10,
     lambda time: '%7.1f μs ' % (time * 1e6))
test((200, 100, 200),
     [dtjc, kcsquared, numpy1, numpy2, numpy3],
     1,
     lambda time: '%6.1f ms ' % (time * 1e3))
test((1000, 100, 1000),
     [numpy1, numpy2, numpy3],
     1,
     lambda time: '%5.1f ms ' % (time * 1e3))

Upvotes: 1

kcsquared
kcsquared

Reputation: 5409

If you look at the time complexity for the 'try all combinations' brute force solution, it's equal to O((N choose K) * K) = O(K * N^K), since there are N choose K ways to pick K distinct integers from 1 to N inclusive, and evaluating their sum takes K-1 additions. For all but tiny values of N and K, this is astronomically large.

Better Solution: Dynamic Programming

A much faster and slightly simpler solution is dynamic programming. We can write this as a 3D dynamic programming problem:

Let dp[i][j][r], 0 <= i <= N; 0 <= j <= K; 0 <= r < M
be the number of combinations of j ints from [1, 2, ..., i] 
with sum congruent to r modulo M. We want dp[N][K][0]

dp[i][j][r] = 1 if i == j == r == 0
              0 if i == 0 and (j /= 0 or r /= 0)
              1 if j == 0 and r == 0
              0 if j == 0 and r /= 0
              dp[i-1][j][r] + dp[i-1][j-1][(r-i) % M] otherwise

There's a lot of edge cases added into the formula, but the most important aspect is the last case: Our Dynamic Programming subproblem depends on at most 2 other subproblems, so the total runtime is the size of our DP array: O(nmk). Here's a Python implementation:

def get_combinations_dp(max_part_size: int, total_num_parts: int, mod: int) -> int:
    BIG_MOD = 10 ** 9 + 7

    # Optimization if no partitions exist
    if total_num_parts > max_part_size:
        return 0

    largest_sum = ((max_part_size * (max_part_size + 1) // 2)
                   - ((max_part_size - total_num_parts) *
                      (max_part_size - total_num_parts + 1) // 2))
    # Optimization if largest partition sum still smaller than mod
    if largest_sum < mod:
        return 0

    dp = [[0 for _ in range(mod)] for _ in range(total_num_parts + 1)]
    dp[0][0] = 1

    for curr_max_part in range(1, max_part_size + 1):
        for curr_num_parts in reversed(range(0, total_num_parts)):
            for rem in range(mod):
                dp[curr_num_parts + 1][(rem + curr_max_part) % mod] += dp[curr_num_parts][rem]
                dp[curr_num_parts + 1][(rem + curr_max_part) % mod] %= BIG_MOD
    return dp[total_num_parts][0]

The parameters are N, K, M, renamed to max_part_size, total_num_parts, mod, with some optional pre-checks to return 0 immediately if there's no partition.

Now, suppose you want to do better than O(nmk). Here, things get tricky. If you want to do strictly better, the only way I can imagine is to find the generating function for these partitions, and use FFT or some other fast polynomial multiplication modulo 10**9 + 7. For a start on researching how to do this, I'd recommend this thread on the math stackexchange, which pertains to counting precisely these partitions in terms of more well known partition numbers, whose generating functions are already known. Even so, I was unable to find any mention of whether this generating function has a short representation, and using partition numbers directly can't improve on the O(nmk) complexity.

Using combinatorics

If you still want to use this dynamic programming approach, there's a small modification using combinatorics which may be asymptotically faster when N is larger than M*K: it runs in time O((M*K)^2), which doesn't depend on N. The idea is to use our DP formula, but instead of choosing K distinct integers from [1, ... N], we now choose K (possibly repeated) residue classes from [0, ... M-1].

How does this work? First, we need to count how many ints in [1, ... N] fall into each residue class i mod M. Call this number R[i], for 0 <= i < M. You can calculate this as

R[i] = floor(N/M) + (1 if 0 < i <= N%M else 0)

Now we can write a slightly modified Dynamic Programming definition and formula:

Let dp[i][j][r], 0 <= i <= M; 0 <= j <= K; 0 <= r < M
be the number of combinations with replacement of j ints from 
residue classes [0, 1, ... i-1] with sum congruent to r modulo M. 
We want dp[M][K][0]:

dp[i][j][r] = 1 if i == j == r == 0
              0 if i == 0 and (j /= 0 or r /= 0)
              0 if i < 0 or j < 0
              F(i, j, r) otherwise

F(i, j, r) = Sum from p = 0 to min(R[i], j) of:
(R[i] choose p) * dp[i-1][j-p][(r - i*p) % M]

Upvotes: 6

Venkatesh Dharavath
Venkatesh Dharavath

Reputation: 520

I hope you might have already solved the problem. But still, I'm answering this question for people who find it helpful.

You have tried to get the combinations on your own, but you could've used a library instead to get all the possible combinations and simply iterate through and check for the condition. If the purpose is not only for learning, it's always ok to use already available code.

Anyways, have a look at the code. Thanks.

from itertools import combinations
def getCombi(n, k, m):
    count = 0
    #Required Combinations
    reqcombis = []
    array = [i for i in range(1, n+1)]
    #getting all possible combinations
    totalcombins = combinations(array, k)
    for i in totalcombins:
        if sum(i) % m == 0 and sum(i) <= n:
            count+=1
            reqcombis.append(i) 
    return count, reqcombis

if __name__ == "__main__":
    n, k, m = input().split(",")
    n, k, m = int(n), int(k), int(m)
    print(getCombi(n, k, m))

Upvotes: 0

Related Questions