Firex Firexo
Firex Firexo

Reputation: 265

How to find the minimum number of operation(s) to make the string balanced?

From Codechef:

A string is considered balanced if and only if all the characters occur in it equal number of times.

You are given a string S; this string may only contain uppercase English letters. You may perform the following operation any number of times (including zero): Choose one letter in S and replace it by another uppercase English letter. Note that even if the replaced letter occurs in S multiple times, only the chosen occurrence of this letter is replaced.

Find the minimum number of operations required to convert the given string to a balanced string.

Example:

For input: ABCB

Here, we can replace C with A, TO GET: ABAB, where each character of the string occurs for 2 times.

So, the number of minimum operation(s)=1.

How to make the string good?

Can I apply dynamic programming to it ?

Upvotes: 9

Views: 37874

Answers (5)

ruakh
ruakh

Reputation: 183341

I don't think you really need dynamic programming here.

One approach in O(length(S)) time:

  • Iterate over S, build a frequency-map (a mapping from distinct letters A–Z to counts). For your ABCB example, that would be A->1 B->2 C->1 D->0 E->0 ... Z->0, which we can represent as the array [1, 2, 1, 0, 0, ..., 0].
    • We can do this because we don't actually care about the positions of letters at all; there's no real difference between ABCB and ABBC, in that each can be balanced by replacing its C with an A.
  • Sort the array. For your example, that gives [0, 0, ..., 0, 1, 1, 2].
    • We can do this because we don't actually care about which letter was which; there's no real difference between ABCB and ABDB, in that each can be balanced by replacing one of its singleton letters with its other one.
  • Assuming the string is nonempty (since if it's empty then the answer is just 0), the final balanced string will contain at least 1 and at most 26 distinct letters. For each integer i between 1 and 26, determine how many operations you'd need to perform in order to produce a balanced string with i distinct letters:
    • First, check that length(S) is a multiple of i; if not, this isn't possible, so skip to the next integer.
    • Next, compute length(S)/i, the count of each distinct letter in the final balanced string.
    • To count how many operations need to be performed, we're going to examine all the counts that need to be increased. (We could, equivalently, examine the counts that need to be decreased: they'll have to match.)
    • We're only interested in the last i elements of the sorted array. Any elements before that point represent letters that won't occur in the balanced string: either the counts are already zero and will remain so, or they are nonzero but will be decreased to zero. Either way, since we're only tracking increases, we can ignore them.
    • For each of the last i counts that are less than length(S)/i, add the difference. This sum is the number of operations. (Note that, since the counts are sorted, you can exit this inner loop as soon as you get to a count that's greater than or equal to the target count.)
    • You can exit this loop after the first i that's greater than or equal to the number of distinct letters in the original S (aside from values of i that we had to skip because they don't evenly divide length(S)). For example, if length(S) = 100, and the original S has 19 distinct letters, then we only need to consider i as high as 20. (Hat-tip to Eric Wang for suggesting something along these lines.)
  • Return the least of these up-to-26 sums. (Note that you don't actually need to store all the sums; you can just keep track of the minimum.)

Upvotes: 17

Santosh
Santosh

Reputation: 41

if __name__ == "__main__":
  for j in range(int(input())):
    S = str(input())
    N = len(S)
    A = [0]*27
    for c in S:
      A[ord(c) - 65] = A[ord(c) - 65] + 1
    A = sorted(A,reverse=True)
    minSwap = N + 1
    for i in range(1,27):
      if N%i == 0:
        temp = N//i
        tempSwap = 0
        for f in range(i):
          if temp > A[f]:
            tempSwap = tempSwap + temp - A[f]
        if tempSwap <= minSwap:
           minSwap = tempSwap
    if minSwap == N+1:
        minSwap = 0
    print(minSwap)

Upvotes: 1

swapnil
swapnil

Reputation: 29

#include <iostream>
#include <string>
#include <vector>

int countOps(std::vector<int> &map, int requiredFreq) {
    int countOps = 0, greaterFreq = 0, lesserFreq = 0;
    for (auto a : map) {
        if (a > 0 && a < requiredFreq) {
            lesserFreq =  lesserFreq + abs(requiredFreq - a);
        }
        else if (a > 0 && a > requiredFreq) {
            greaterFreq =  greaterFreq + abs(requiredFreq - a);
        }
    }
    
    countOps = greaterFreq > lesserFreq ? (lesserFreq + (greaterFreq - lesserFreq)) : lesserFreq;
    
    return countOps;
}

int balanceString(std::string &s, long n){
    
    std::vector<int> map(26, 0);
    int distinctChar = 0;
    int requiredFreq = -1;
    int count = INT_MAX;
    
    // Creating map with frequency and counting distinct chars
    for (char x : s) {
        if (map[x - 'a'] == 0) distinctChar++;
        map[x - 'a']++;
    }
    
    std::sort(std::begin(map), std::end(map), std::greater<int>{});

    // If size is multiple of distinctChar
    if (n % distinctChar == 0) {
        requiredFreq = int(n / distinctChar);
        count = countOps(map, requiredFreq);
    }
    else {
        for (int i = 1; i < distinctChar;  i++) {
            if (n % i == 0){
                requiredFreq = int(n / i);
                
                std::vector<int> temp(map.begin(), map.begin() + i);
                int x = countOps(temp, requiredFreq);
                count = std::min(count, x);
            }
        }
    }
    
    return count;
}

int main() {
    std::string s = "aaaaccddefghiii";
    long n = s.size();
    
    if(n <= 1) return 0;
    
    int count = balanceString(s, n);

    std::cout << count << std::endl;
    
    return 0;
}

Upvotes: 0

Eric
Eric

Reputation: 24920

Following code implement the solution, in Java, with unit test.

The algorithm is almost identical as in @ruakh's answer, if not identical.


Code

BalanceString.java

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/**
 * Assume string only contains A-Z, the 26 uppercase letter,
 * <p>given a string, you can replace a char with another char from the 26 letter,
 * <p>figure out the minimum replacement required to make the string balance,
 * <p>which means each char in the string occurs the same time,
 *
 * @author eric
 * @date 2/2/19 8:54 PM
 */
public class BalanceString {
    private final char minChar;
    private final char maxChar;
    private final int distinctChars; // total distinct char count,

    public static final BalanceString EN_UPPER_INSTANCE = new BalanceString('A', 'Z');

    public BalanceString(char minChar, char maxChar) {
        this.minChar = minChar;
        this.maxChar = maxChar;
        this.distinctChars = maxChar - minChar + 1;

        if (distinctChars <= 0)
            throw new IllegalArgumentException("invalid range of chars: [" + minChar + ", " + maxChar + "]");
    }

    /**
     * Check minimal moves needed to make string balanced.
     *
     * @param str
     * @return count of moves
     */
    public int balanceCount(String str) {
        // System.out.printf("string to balance:\t%s\n", str);
        int len = str.length(); // get length,
        if (len <= 2) return 0; // corner cases,

        Map<Character, Integer> coMap = figureOccurs(str); // figure occurrences,
        Integer[] occurs = sortOccursReversely(coMap); // reversely order occurrences,

        int m = coMap.size(); // distinct char count,
        int maxN = (len < distinctChars ? len : distinctChars); // get max possible distinct char count, included,

        int smallestMoves = Integer.MAX_VALUE; // smallest moves, among all possible n,

        // check each possible n, and get its moves,
        for (int n = 1; n <= maxN; n++) {
            if (len % n == 0) {
                int moves = figureMoves(len, coMap, occurs, m, n);
                if (moves < smallestMoves) smallestMoves = moves;
            }
        }

        return smallestMoves;
    }

    /**
     * Figure occurs for each char.
     *
     * @param str
     * @return
     */
    protected Map<Character, Integer> figureOccurs(String str) {
        Map<Character, Integer> coMap = new HashMap<>();
        for (char c : str.toCharArray()) {
            if (c < minChar || c > maxChar)
                throw new IllegalArgumentException(c + " is not within range 'A-Z'");

            if (!coMap.containsKey(c)) coMap.put(c, 1);
            else coMap.put(c, coMap.get(c) + 1);
        }

        return coMap;
    }

    /**
     * Get reverse sorted occurrences.
     *
     * @param coMap
     * @return
     */
    protected Integer[] sortOccursReversely(Map<Character, Integer> coMap) {
        Integer[] occurs = new Integer[coMap.size()];

        coMap.values().toArray(occurs);
        Arrays.sort(occurs, Collections.reverseOrder());

        return occurs;
    }

    /**
     * Figure moves needed to balance.
     *
     * @param len   length of string,
     * @param coMap
     * @param m     original distinct elements count,
     * @param n     new distinct elements count,
     * @return count of moves,
     */
    protected int figureMoves(int len, Map<Character, Integer> coMap, Integer[] occurs, int m, int n) {
        int avgOccur = len / n; // average occurrence,
        int moves = 0;

        if (n == m) { // distinct chars don't change,
            for (Integer occur : occurs) {
                if (occur <= avgOccur) break;
                moves += (occur - avgOccur);
            }
        } else if (n < m) { // distinct chars decrease,
            for (int i = 0; i < n; i++) moves += Math.abs(occurs[i] - avgOccur); // for elements kept,
            for (int i = n; i < m; i++) moves += occurs[i]; // for elements to replace,
            moves >>= 1;
        } else { // distinct chars increase,
            for (int i = 0; i < occurs.length; i++) moves += Math.abs(occurs[i] - avgOccur); // for existing elements,
            moves += ((n - m) * avgOccur); // for new elements,
            moves >>= 1;
        }

        return moves;
    }

    public char getMinChar() {
        return minChar;
    }

    public char getMaxChar() {
        return maxChar;
    }

    public int getDistinctChars() {
        return distinctChars;
    }
}

BalanceStringTest.java
(Unit test, via TestNG)

import org.testng.Assert;
import org.testng.annotations.Test;

/**
 * BalanceString test.
 *
 * @author eric
 * @date 2/2/19 9:36 PM
 */
public class BalanceStringTest {
    private BalanceString bs = BalanceString.EN_UPPER_INSTANCE;

    @Test
    public void test() {
        // n < m case,
        Assert.assertEquals(bs.balanceCount("AAAABBBC"), 1); // e.g 1A -> B,
        Assert.assertEquals(bs.balanceCount("AAAAABBC"), 2); // e.g 1A -> B, 1C -> B,
        Assert.assertEquals(bs.balanceCount("AAAAAABC"), 2); // e.g 1B -> A, 1C -> A,
        Assert.assertEquals(bs.balanceCount("AAAAAAAB"), 1); // e.g 1B -> A,

        // n > m case,
        Assert.assertEquals(bs.balanceCount("AAAABBBBCCCCDDDDEEEEAAAA"), 4); // add 1 new char, e.g change 4 A to 4 F,
        Assert.assertEquals(bs.balanceCount(genIncSeq(10)), 15); // A-J, 10 distinct chars, 55 in length; solved by add 1 new char, need 15 steps,

        // n == m case,
        Assert.assertEquals(bs.balanceCount(genIncSeq(3)), 1); // A-C, 3 distinct chars, 6 in length; symmetric, solved with same distinct chars, need 1 steps,
        Assert.assertEquals(bs.balanceCount(genIncSeq(11)), 15); // A-K, 11 distinct chars, 66 in length; symmetric, solved with same distinct chars, need 15 steps,

        // n < m, or n > m case,
        Assert.assertEquals(bs.balanceCount("ABAC"), 1); // e.g 1A -> B, or 1A -> D,
    }

    // corner case,
    @Test
    public void testCorner() {
        // m <= 2,
        Assert.assertEquals(bs.balanceCount(""), 0);
        Assert.assertEquals(bs.balanceCount("A"), 0);
        Assert.assertEquals(bs.balanceCount("AB"), 0);
        Assert.assertEquals(bs.balanceCount("AA"), 0);

        /*------ m == n == distinctChars ------*/
        String mndBalanced = genMndBalancedSeq(); // each possible char occurs exactly once, already balanced,
        Assert.assertEquals(mndBalanced.length(), bs.getDistinctChars());
        Assert.assertEquals(bs.balanceCount(mndBalanced), 0); // no need change,

        char lastChar = mndBalanced.charAt(mndBalanced.length() - 1);
        String mndOneDup = mndBalanced.replace(lastChar, (char) (lastChar - 1)); // (distinctChars -2) chars occur exactly once, one occurs twice, one is missing, thus it's one step away to balance,
        Assert.assertEquals(mndOneDup.length(), bs.getDistinctChars());
        Assert.assertEquals(bs.balanceCount(mndOneDup), 1); // just replace the duplicate char with missing char,
    }

    // invalid input,
    @Test(expectedExceptions = IllegalArgumentException.class)
    public void testInvalidInput() {
        Assert.assertEquals(bs.balanceCount("ABAc"), 1);
    }

    // invalid char range, for constructor,
    @Test(expectedExceptions = IllegalArgumentException.class)
    public void testInvalidRange() {
        new BalanceString('z', 'a');
    }

    /**
     * Generate a string, with first char occur once, second twice, third three times, and so on.
     * <p>e.g A, ABB, ABBCCC, ABBCCCDDDD,
     *
     * @param m distinct char count,
     * @return
     */
    private String genIncSeq(int m) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < m; i++) {
            for (int j = 0; j <= i; j++) sb.append((char) (bs.getMinChar() + i));
        }
        return sb.toString();
    }

    /**
     * Generate a string that contains each possible char exactly once.
     * <p>For [A-Z], it could be: "ABCDEFGHIJKLMNOPQRSTUVWXYZ".
     *
     * @return
     */
    private String genMndBalancedSeq() {
        StringBuilder sb = new StringBuilder();
        char minChar = bs.getMinChar();
        int distinctChars = bs.getDistinctChars();

        for (int i = 0; i < distinctChars; i++) {
            sb.append((char) (minChar + i));
        }
        return sb.toString();
    }
}

All test cases would pass.


Complexity

  • Time: O(len) + O(m * lg(m)) + O(m * factorCount)
    • Each sequential scan takes O(len), there are several sequential loop.
    • Sorting of array takes O(m*lg(m)), which is at most O(distinctChars * lg(distinctChars)), thus constant, and only sort once.
    • To figure out moves for each n, takes O(m).
    • The count of n that needs to figure moves, depends on count of divisible numbers for len, in the range [minChar, maxChar].
      This count if kind small & constant too.
  • Space: O(len)
    • Input string need O(len).
    • Counter hashmap need O(m).
    • Sorted occurrence array need O(m).

Where:

  • len is string length.
  • m is distinct char count in original string
  • distinctChars is distinct char count, e.g 26.
  • maxN max possible distinct char count, included,
  • factorCount divisible number count in range [1, n], by len,
  • minChar min char, e.g A
  • maxChar max char, e.g Z

And:

  • len >= m
  • m <= distinctChars

Upvotes: 1

Aashish Kumar
Aashish Kumar

Reputation: 1006

Towards solving this problem ,I think it is also useful to find out all over extra presence( sum of number of elements which present more than one times) of different elements in string

for ex: in aabbc ,number of elements we have to remove to make presence of each element one is equal to 2 (this is called good string)

`x=input()
char=26
total=0
lis=[0]*char
#print(lis)

for i in range(len(x)):
    lis[ord(x[i])-ord('a')]+=1
#print(lis) 

for i in range(26):
    if(lis[i]>1):
        total=total+(lis[i]-1)
print(total)        

`

Upvotes: 0

Related Questions