Rachit Belwariar
Rachit Belwariar

Reputation: 47

How to find number of ways of permuting a string satisfying the below conditions?

I am given a string,let say- "abcd". Now I have to find all the strings that can be generated by permuting its character such that-

  1. There are exactly four mismatches in the generated strings and,
  2. The mismatches exists in pair, for e.g-

The string - "abcd" has three such permutations- "badc","cdab","dcba".

Explanation-

Let us consider "abcd" and "badc". Now there are exactly four mismatch with , i.e- (a,b),(b,a),(c,d),(d,c) and these mismatches exists in pair.

Note that "abcde" has fifteen such permutations- acbed,adebc,aedcb,baced,badce,baedc,cbaed,cdabe,ceadb,dbeac,dcbae,decab,ebdca,ecbda,edcba

Where I am failing?-

I am just finding the strings manually, but this becomes really time-consuming for strings of large length. Hence,I need a efficient solution.

Upvotes: 0

Views: 98

Answers (1)

WhatsUp
WhatsUp

Reputation: 1626

If you have a string of length n, consisting of n different letters, then the number you are finding is: n * (n - 1) * (n - 2) * (n - 3) / 8.

Reason: there are C(n, 4) ways to choose 4 mismatch places from n letters, and for every such quadruple, there are three ways to pair them.

Hence the result is C(n, 4) * 3 = n * (n - 1) * (n - 2) * (n - 3) / 8.


Here the hypothesis is that all letters in the string are different. It is unclear from your description of problem whether letters can repeat. Please comment on this answer if this is your case. I will then update the answer.


Edit: Now suppose that the letters can repeat. The situation is more complicated. I will just give sketches here.

Let's say the string contains m different letters, occurring a_1, ..., a_m times, respectively. Also, write b_i for the number C(a_i, 2).

There are three cases:

  • Case 1: the four mismatches form two identical pairs, e.g. two a's and two b's are permuted.

In this case, we have sum{b_i * b_j : 1 <= i < j <= m} different strings. This is equal to (sum{b_i}^2 - sum{b_i ^2}) / 2, an expression that can be evaluated in O(m) time.

  • Case 2: the four mismatches contain three different letters, e.g. one a is paired with one b, and another a is paired with one c.

In this case, first choose the letter that is common in the two pairs, let's say it's the i-th letter; then one has to choose the other two letters.

Write s for the sum of all a_i, and t for the sum of all b_i. If we choose two arbitrary letter from the remaining s - a_i letters, we have to exclude cases where the two letters are identical, which has t - b_i possibilities. Thus there are C(s - a_i, 2) - (t - b_i) different ways to choose the other two letters, hence b_i * (C(s - a_i, 2) - (t - b_i)) different strings. Summing them up for all i gives the total number of different strings in this case. It still can be evaluated in O(m) time.

  • Case 3: four different letters, e.g. one a and one b form a pair, and one c and one d form another pair.

The idea is the same. First, consider the possibilities of choosing 4 arbitrary letters from all s letters. Then we have to exclude several cases:

i. All four letters are identical, this has sum{C(a_i, 4)} cases;

ii. Three letter are identical, the fourth is different, this has sum{C(a_i, 3) * (s - a_i)} cases;

iii. Two letters are identical, the other two are also identical, this is exactly the number calculated in Case 1;

iv. Two letters are identical, the other two are different, this is exactly the number calculated in Case 2.

So, the total number of possibilities of choosing four different letters from all s letters is: C(s, 4) - [i] - [ii] - [iii] - [iv]. As before, this number should be multiplied by 3 to yield the number of different strings, because each quadruple gives 3 different strings.

All in all, the time complexity is O(m), which is obviously optimal.

Upvotes: 1

Related Questions