MocTriPsy

Reputation: 61

Number of ways to arrange a sequence

I have M characters. From these characters I need to build a sequence of length N such that no two consecutive characters are the same, and the first and last characters of the sequence are fixed. I need to find the total number of such sequences.

My Approach:

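Dynamic programming. Suppose the first and last characters are '0' and '1', and let dp[i][j] be the number of valid prefixes of length i that end in character j.

dp[1][0] = 1;   // position 1 is fixed to '0'; every other dp[1][j] is 0

for (int i = 2; i < N; i++)
    for (int j = 0; j < M; j++)
        for (int k = 0; k < M; k++)
            if (j != k) dp[i][j] += dp[i-1][k];

So the final answer would be the sum of dp[N-1][i] over all i != 1.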
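For reference, here is a minimal runnable sketch of this DP in Python, usable only for small N. It keeps a single row instead of the full table and replaces the innermost loop with "total minus own entry", which is my own shortcut and not part of the approach above. The name count_dp and the encoding of characters as integers 0..m-1 are assumptions made for illustration.

```python
def count_dp(n, m, first, last):
    """Count length-n sequences over symbols 0..m-1 with no two equal
    neighbours, a fixed first symbol and a fixed last symbol.
    Runs in O(n * m) time, so it is only practical for small n."""
    if n == 1:
        return 1 if first == last else 0
    # ways[c] = number of valid prefixes of the current length ending in c
    ways = [0] * m
    ways[first] = 1
    for _ in range(n - 1):
        total = sum(ways)
        # a prefix ending in c can extend any shorter prefix not ending in c
        ways = [total - ways[c] for c in range(m)]
    return ways[last]


if __name__ == "__main__":
    print(count_dp(4, 128, 0, 1))  # first and last characters differ
    print(count_dp(4, 128, 0, 0))  # first and last characters are the same
```

This is handy as a slow reference to check a faster solution against on small inputs.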

Problem:

Here the length N is very large, around 10^15, and M is around 128. How can I find the number of such sequences without using arrays?

Upvotes: 2

Views: 129

Answers (2)

Paul Hankin

Reputation: 58271

Assume M is fixed. Let D(n) be the number of sequences of length n with no two consecutive characters equal where the first and last characters differ (but are fixed). Let S(n) be the number of such sequences of length n where the first and last characters are the same (but are fixed).

For example, D(6) is the number of strings of the form a????b (for some a and b -- noting that for counting it doesn't matter which two characters we chose, and where the ? represent other characters). Similarly, S(6) is the number of strings of the form a????a.

Consider a sequence of length n>3 of the form a....?b. The ? can be any of M-1 characters (anything except b). One of these is a. So D(n) = S(n-1) + (M-2)D(n-1). Using a similar argument, one can figure out that S(n) = (M-1)D(n-1).

For example, how many strings are there of the form a??b? Well, the character just before the b could be a or something else. How many strings are there when it's a? Well, it's the same as the number of strings of the form a?a. How many strings are there when it's something else? Well it's the same as the number of strings of the form a?c multiplied by the number of choices we had for c (namely: M-2 -- everything except for a which we've already counted, and b which is excluded by the rules).

If the length is odd, we can consider the middle character. Write the length as 2n+1 and consider a sequence of the form a...?...b. The ? (which is in the center of the string) can be a, b, or one of the other M-2 characters, and it splits the string into two halves of length n+1 that share the middle character. Thus D(2n+1) = S(n+1)D(n+1) + D(n+1)S(n+1) + (M-2)D(n+1)D(n+1). Similarly, S(2n+1) = S(n+1)S(n+1) + (M-1)D(n+1)D(n+1).

For small n, S(2)=0, S(3)=M-1, D(2)=1, D(3)=M-2.
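As a sanity check on the recurrences and base cases above, here is a small sketch that iterates the first pair of equations from n=2, compares the result with exhaustive enumeration for tiny inputs, and also evaluates the middle-character split. The names sd_linear, sd_from_middle and brute are hypothetical helpers, and characters are encoded as integers 0..m-1 purely for illustration.

```python
from itertools import product


def sd_linear(n, m):
    """Return (S(n), D(n)) by iterating the one-step recurrences from n = 2."""
    s, d = 0, 1  # base case: S(2) = 0, D(2) = 1
    for _ in range(n - 2):
        s, d = (m - 1) * d, s + (m - 2) * d
    return s, d


def sd_from_middle(k, m):
    """Return (S(2k+1), D(2k+1)) from the values at length k+1,
    using the split on the middle character."""
    s, d = sd_linear(k + 1, m)
    return s * s + (m - 1) * d * d, 2 * s * d + (m - 2) * d * d


def brute(n, m, first, last):
    """Enumerate every candidate sequence; only feasible for tiny n and m."""
    count = 0
    for mid in product(range(m), repeat=n - 2):
        seq = (first, *mid, last)
        if all(x != y for x, y in zip(seq, seq[1:])):
            count += 1
    return count


if __name__ == "__main__":
    for n in range(2, 7):
        assert sd_linear(n, 4) == (brute(n, 4, 0, 0), brute(n, 4, 0, 1))
    for k in range(1, 6):
        assert sd_from_middle(k, 5) == sd_linear(2 * k + 1, 5)
    print("recurrences agree with brute force")
```

The linear version takes O(n) steps, so it is only a cross-check and not a solution for n around 10^15.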

We can use the above equations (the first set for even n>3, the second set for odd n>3, and the base cases for n=2 or 3) to compute the result you need in O(log N) arithmetic operations. Presumably the question asks you to compute the result modulo something (since the result grows like O(M^(N-2))), but that's easy to incorporate into the results.

Working code that uses this approach:

def C(n, m, p):
    if n == 2:
        return 0, 1
    if n == 3:
        return (m-1)%p, (m-2)%p
    if n % 2 == 0:
        S, D = C(n-1, m, p)
        return ((m-1) * D)%p, (S + (m-2) * D)%p
    else:
        S, D = C((n-1)//2+1, m, p)
        return (S*S + (m-1)*D*D)%p, (2*S*D + (m-2)*D*D)%p

Note that in this code, C(n, m, p) returns two numbers -- S(n)%p and D(n)%p.

For example:

>>> p = 2**64 - 59  # Some large prime
>>> print(C(4, 128, p))
>>> print(C(5, 128, p))
>>> print(C(10**15, 128, p))

(16002, 16003)
(2032381, 2032380)
(12557489471374801501, 12557489471374801502)

Looking at these examples, it seems like D(n) = S(n) + (-1)^n. If that's true, the code can be simplified a bit I guess.
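The pattern does follow from the recurrences: subtracting S(n) = (M-1)D(n-1) from D(n) = S(n-1) + (M-2)D(n-1) gives D(n) - S(n) = -(D(n-1) - S(n-1)), and D(2) - S(2) = 1, so the difference is (-1)^n. Combining this with the fact that there are (M-1)^(n-1) sequences with a fixed first character in total, i.e. S(n) + (M-1)D(n) = (M-1)^(n-1), gives D(n) = ((M-1)^(n-1) + (-1)^n) / M. Below is a sketch of that closed form modulo p. The name sd_closed is hypothetical, and the code assumes gcd(m, p) = 1 so that m has an inverse modulo p (true for m = 128 and any odd p); pow(m, -1, p) needs Python 3.8 or later.

```python
def sd_closed(n, m, p):
    """Return (S(n) % p, D(n) % p) from the closed form
    D(n) = ((m-1)^(n-1) + (-1)^n) / m and S(n) = D(n) - (-1)^n.

    Assumes n >= 2 and gcd(m, p) == 1 (needed for the modular inverse)."""
    sign = 1 if n % 2 == 0 else -1      # (-1)^n
    inv_m = pow(m, -1, p)               # modular inverse of m
    d = (pow(m - 1, n - 1, p) + sign) * inv_m % p
    s = (d - sign) % p
    return s, d


if __name__ == "__main__":
    p = 2**64 - 59
    print(sd_closed(4, 128, p))
    print(sd_closed(5, 128, p))
```

This needs only one modular exponentiation, so it also runs in O(log n) multiplications.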

Another, perhaps easier, way to do it efficiently is to use a matrix and the first set of equations. (Sorry for the ASCII art -- this diagram is a vector = matrix * vector):

(D(n)) = (M-2  1) * (D(n-1))
(S(n)) = (M-1  0)   (S(n-1))

Telescoping this, and using that D(2)=1, S(2)=0:

(D(n)) = (M-2  1)^(n-2) (1)
(S(n)) = (M-1  0)       (0)

You can perform the matrix power using exponentiation by squaring in O(log n) time.

Here's working code, including the examples (which you can check produce the same values as the code above). Most of the code is actually matrix multiply and matrix power -- you can probably replace a lot of it with numpy code if you use that package.

def mat_mul(M, N, p):
    R = [[0, 0], [0, 0]]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                R[i][j] += M[i][k] * N[k][j]
                R[i][j] %= p
    return R

def mat_pow(M, n, p):
    if n == 0:
        return [[1, 0], [0, 1]]
    if n == 1:
        return M
    if n % 2 == 0:
        R = mat_pow(M, n//2, p)
        return mat_mul(R, R, p)
    return mat_mul(M, mat_pow(M, n-1, p), p)

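def Cmat(n, m, p):
    # Transition matrix A: (D(n), S(n)) = A * (D(n-1), S(n-1)).
    A = [[(m-2) % p, 1], [(m-1) % p, 0]]
    A = mat_pow(A, n-2, p)
    # A^(n-2) applied to (D(2), S(2)) = (1, 0) is just its first column.
    return A[1][0], A[0][0]  # (S(n) % p, D(n) % p), same order as C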

p = 2**64 - 59
print(Cmat(4, 128, p))
print(Cmat(5, 128, p))
print(Cmat(10**15, 128, p))

Upvotes: 2

k_ssb

Reputation: 6252

You only need to count the number of acceptable sequences, not find them explicitly. It turns out that it doesn't matter what the majority of the characters are. There are only 4 kinds of characters that matter:

  • The first character
  • The last character
  • The last-used character, so you don't repeat characters consecutively
  • All other characters

In other words, you don't need to iterate over all 10^15 characters. You only need to consider the four cases above, since most characters can be lumped together into the last case.
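One possible reading of this idea, as a hedged sketch: track only which class the last-used character belongs to (the first character, the last character, or any of the others) and carry three counters forward. The function name count_lumped and its same_ends flag are hypothetical and not from the answer above.

```python
def count_lumped(n, m, same_ends=False):
    """Count sequences of length n over m characters with no two equal
    neighbours and fixed end characters, tracking only the class of the
    last-used character. Assumes n >= 1 and m >= 2."""
    # Length 1: the sequence consists of the first character only.
    at_first, at_last, at_other = 1, 0, 0
    for _ in range(n - 1):
        at_first, at_last, at_other = (
            at_last + at_other,                    # step onto the first character
            at_first + at_other,                   # step onto the last character
            (m - 2) * (at_first + at_last)         # leave a tracked character
            + (m - 3) * at_other,                  # move between "other" characters
        )
    return at_first if same_ends else at_last


if __name__ == "__main__":
    print(count_lumped(4, 128))
    print(count_lumped(4, 128, same_ends=True))
```

This loop is still O(n), so on its own it does not handle n around 10^15. The update is linear in the three counters, though, so it could be written as a 3x3 matrix and raised to a power by repeated squaring.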

Upvotes: 1
