justtestingout

Reputation: 31

Cannot convert Char and Int HashMap to a string

I've been trying to figure out how to convert all of the elements (both keys and values) to a string. The result should look something like "a2b3c4" for example.


import java.util.*;

class Main{
  public static String stringCompression(String s){

    s = s.toLowerCase();
    HashMap<Character,Integer> map = new HashMap<>();

  for (int i = 0; i < s.length(); i++){
    char c = s.charAt(i);
    if (map.containsKey(c)){
        map.put(c, map.get(c) + 1);
    }
    else{
      map.put(c, 1);
    }
  }

  StringBuilder sb = new StringBuilder(); 
  for (int i = 0; i < map.size(); i++){
    sb.append(map.get(i));
  }
  
  return sb.toString();

  }

  public static void main (String[] args){
    System.out.println(stringCompression("aabccccaaa"));
    System.out.println(stringCompression("abc"));
    System.out.println(stringCompression("AabBcCdDdDDeE"));
  }
}

Upvotes: 0

Views: 525

Answers (3)

Basil Bourque

Reputation: 338346

tl;dr

map.get( i ) is incorrect. Your keys are Character objects, not the int i used in your loop.

Call Map#keySet, and loop those key objects rather than incrementing an int i. Replace your for (int i = 0; i < map.size(); i++){ with for ( Character key : map.keySet() )

Details

By the way, you use the word "compression" but what you are doing is not compression. You are counting total repetition of each character, rather than counting each series of a repeated character separately.

Your code for populating the map is correct. We can see that by doing a println to see the map’s contents. For input "aabccccaaa" we get this map: map.toString(): {a=5, b=1, c=4}.

The problem is where you build up text with StringBuilder. You call map.get( i ). This means you are asking for the value mapped to a key of 0, 1, 2, and such. But you are not using integer numbers as your key. You are using Character as your key type.
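To see the effect concretely, here is a minimal sketch that reproduces the lookup from the question. The class and method names (KeyTypeMismatchDemo, appendByIndex) are made up for illustration. Because the int index is autoboxed to an Integer, and an Integer never equals a Character key, every lookup returns null, and StringBuilder appends the text "null".

```java
import java.util.HashMap;
import java.util.Map;

class KeyTypeMismatchDemo {
    // Hypothetical helper mimicking the loop from the question:
    // it looks up int indexes in a map whose keys are Character objects.
    static String appendByIndex(Map<Character, Integer> map) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < map.size(); i++) {
            // i is boxed to Integer, which matches no Character key, so get returns null.
            sb.append(map.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<Character, Integer> map = new HashMap<>();
        map.put('a', 5);
        map.put('b', 1);
        map.put('c', 4);
        System.out.println(appendByIndex(map)); // prints "nullnullnull"
    }
}
```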

Instead, you should be looping on the set of all keys. Then do a get for each of those keys. Your StringBuilder code should look like this:

StringBuilder sb = new StringBuilder();
Set < Character > keys = map.keySet();
for ( Character key : keys )
{
    Integer count = map.get( key );
    sb.append( key ).append( count );
}

By the way, your insertions into the map could be reduced to a single call by using the Map#merge method. So this:

    if (map.containsKey(c)){
        map.put(c, map.get(c) + 1);
    }
    else{
      map.put(c, 1);
    }

…becomes this:

map.merge( c , 1 , Integer :: sum );

The Integer :: sum is a method reference. A method reference is an object that represents that method rather than calls that method. The merge method will call that sum method while performing its work.
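As a rough illustration of what the method reference stands for, the sketch below counts characters twice: once with Integer::sum and once with an equivalent lambda. The class and method names are invented for this example. Both variants should produce the same map.

```java
import java.util.HashMap;
import java.util.Map;

class MergeDemo {
    // Counts characters using a method reference as the merge function.
    static Map<Character, Integer> countWithMethodRef(String text) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : text.toCharArray()) {
            // If c is absent, store 1; otherwise store Integer.sum(oldCount, 1).
            counts.merge(c, 1, Integer::sum);
        }
        return counts;
    }

    // Same logic, with the merge function written out as a lambda.
    static Map<Character, Integer> countWithLambda(String text) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : text.toCharArray()) {
            counts.merge(c, 1, (oldCount, increment) -> oldCount + increment);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWithMethodRef("aabccccaaa").equals(countWithLambda("aabccccaaa"))); // true
    }
}
```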

By the way, your line s = s.toLowerCase(); replaces the passed argument String object with a new String object. I do not recommend this. You have discarded the original text with nothing gained. You have tossed valuable information needed for debugging and logging. Instead, define a new String variable: String input = s.toLowerCase();. And to prevent such re-assignment of the argument variable, mark the argument declaration as final.

Complete code example:

package work.basil.example;

import java.util.HashMap;
import java.util.Set;

public class RepeatCounter
{
    public static void main ( String[] args )
    {
        System.out.println( countCharacterOccurrence( "aabccccaaa" ) );
        System.out.println( countCharacterOccurrence( "abc" ) );
        System.out.println( countCharacterOccurrence( "AabBcCdDdDDeE" ) );
    }

    public static String countCharacterOccurrence ( final String s )
    {

        String input = s.toLowerCase();
        HashMap < Character, Integer > map = new HashMap <>();

        for ( int i = 0 ; i < input.length() ; i++ )
        {
            char c = input.charAt( i );
            map.merge( c , 1 , Integer :: sum );
        }
        // System.out.println( "map = " + map );

        StringBuilder sb = new StringBuilder();
        Set < Character > keys = map.keySet();
        for ( Character key : keys )
        {
            Integer count = map.get( key );
            sb.append( key ).append( count );
        }

        return sb.toString();
    }
}

When run:

a5b1c4
a1b1c1
a2b2c2d5e2

You may want to use another Map implementation rather than HashMap. The HashMap class makes no promise as to the iteration order of its keys.

If you want the keys kept in sorted order, use a SortedMap/NavigableMap implementation such as TreeMap.

If you want the keys kept in their original insertion order, use LinkedHashMap.
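A small sketch of the difference, assuming a hypothetical helper named render that fills whichever empty map it is given and then iterates it. With the input "banana", a LinkedHashMap reports characters in first-seen order, while a TreeMap reports them in sorted order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class OrderedCountDemo {
    // Counts characters into the supplied (empty) map, then renders key/count pairs
    // in whatever iteration order that map implementation provides.
    static String render(String text, Map<Character, Integer> counts) {
        for (char c : text.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Character, Integer> entry : counts.entrySet()) {
            sb.append(entry.getKey()).append(entry.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("banana", new LinkedHashMap<>())); // b1a3n2 (insertion order)
        System.out.println(render("banana", new TreeMap<>()));       // a3b1n2 (sorted order)
    }
}
```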

Here is a graphic table I made of the Map implementations bundled with Java.

Table of map implementations in Java 11, comparing their features


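By the way, the char/Character type in Java is obsolete. A char is only 16 bits wide, so a single char is unable to represent even half of the 143,859 characters defined in Unicode.

While your code can handle an input of "Hello World", it will fail with an input of "Hello 👋 world 🗺️": each emoji is stored as two char values (a surrogate pair), and your loop counts those two halves as separate characters.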

Learn to use integer code point numbers instead. So your Map would be a Map < Integer , Integer > where the first Integer represents the code point number of each character, and the second Integer is the count of its occurrences.
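One possible shape for that approach is sketched below. The class and method names are assumptions made for this example, and a TreeMap is used only to keep the output order predictable. Note that code points are still not the same thing as user-perceived characters: an emoji followed by a variation selector, for instance, spans more than one code point.

```java
import java.util.Map;
import java.util.TreeMap;

class CodePointCounter {
    // Counts occurrences per Unicode code point rather than per 16-bit char.
    static Map<Integer, Integer> countCodePoints(String text) {
        Map<Integer, Integer> counts = new TreeMap<>();
        text.codePoints().forEach(cp -> counts.merge(cp, 1, Integer::sum));
        return counts;
    }

    // Renders each code point followed by its count, in ascending code point order.
    static String render(Map<Integer, Integer> counts) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            sb.appendCodePoint(entry.getKey()).append(entry.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\uD83D\uDC4B" is the surrogate pair for the waving-hand emoji, U+1F44B.
        System.out.println(render(countCodePoints("a\uD83D\uDC4Ba\uD83D\uDC4B")));
    }
}
```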

For more info, read: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).

Upvotes: 2

Thiyanesh

Reputation: 2360

Issues

  1. Unfortunately, using a map will not give the correct answer to your question.
  2. If you scan the complete string and accumulate counts in a Map, occurrences of the same character from different parts of the string are added to the same key.
  3. This results in an incorrect solution.
  4. Incorrect example: aaabbbaaa -> a6b3 (assuming you use LinkedHashMap)
  5. The actual output should be a3b3a3

Code as per Question

  map.entrySet().stream()
    .map(e -> String.valueOf(e.getKey()) + e.getValue())
    .collect(Collectors.joining(""))

Run-length encoding: a possible (though still inefficient) compression solution

  1. Iterate over the characters from left to right
  2. Combine adjacent identical characters into the character followed by its count
  3. This does not compress repeated subpatterns

    public static String compress(String str) {
        if (str == null || str.length() < 3) {
            return str;
        }
        char[] input = str.toCharArray();
        int left = 0;
        char state = input[0];

        StringBuilder builder = new StringBuilder();
        for (int right = 1; right < input.length; right++) {
            final char candidate = input[right];
            // on mismatch add the previously noted character
            if (state != candidate) {
                if (right - left > 1) {
                    builder.append(state).append(right - left);
                } else {
                    builder.append(state);
                }
                // watch the current character
                left = right;
                state = candidate;
            }
        }
        // add the last unique character
        builder.append(state);
        if (input.length - left > 1) {
            builder.append(input.length - left);
        }
        System.out.println(builder.toString());
        return builder.toString();
    }

    public static void main(String[] args) {
        compress("abc"); // abc
        compress("aabbc"); // a2b2c
        compress("aabbcaabbc"); // a2b2ca2b2c actually this should be (a2b2c)2

    }

Upvotes: 0

Nowhere Man

Reputation: 19545

It can be done conveniently using the Stream API:

  1. Prepare a frequency map
  2. Convert the map entries to Strings
  3. Join the strings without a delimiter

// Requires: import java.util.stream.Collectors;
public static String stringCompression(String s) {
    return s.toLowerCase().chars()
            .mapToObj(c -> (char)c)
            .collect(Collectors.groupingBy(c -> c, Collectors.summingInt(c -> 1)))
            .entrySet()
            .stream()
            .map(e -> e.getKey() + Integer.toString(e.getValue()))
            .collect(Collectors.joining());
}

Test:

System.out.println(stringCompression("aabccccaaa"));
System.out.println(stringCompression("abc"));
System.out.println(stringCompression("AabBcCdDdDDeE"));

Output:

a5b1c4
a1b1c1
a2b2c2d5e2
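Note that the two-argument groupingBy makes no guarantee about the type of map it returns, so the order of the output above is not guaranteed either. If sorted output matters, one option is the three-argument overload with a map supplier. This is a sketch under that assumption, with an invented class and method name; it also swaps summingInt for counting, so the values are Long.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class SortedStreamCounter {
    // Groups characters into a TreeMap so entries are iterated in sorted key order.
    static String sortedCounts(String s) {
        Map<Character, Long> counts = s.toLowerCase().chars()
                .mapToObj(c -> (char) c)
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
        return counts.entrySet().stream()
                .map(e -> String.valueOf(e.getKey()) + e.getValue())
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        System.out.println(sortedCounts("AabBcCdDdDDeE")); // a2b2c2d5e2
        System.out.println(sortedCounts("cba"));           // a1b1c1
    }
}
```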

Upvotes: 0
