Zorg

Reputation: 71

Count unique chars and validate String in some cases using Java Stream

I'm trying to write a method that validates a String. A string is valid if every char occurs the same number of times, like "aabb", "abcabc" or "abc". It is also valid if it contains exactly one extra symbol, like "ababa" or "aab". All other cases are invalid.

Update: sorry, I forgot to mention cases like abcabcab -> a-3, b-3, c-2 -> 2 extra symbols (a, b) -> invalid. My code doesn't cover such cases. A space counts as a symbol, and uppercase letters are different from lowercase letters. This is what I have now, but it looks convoluted (especially the last two methods):

public boolean validate(String line) {
    List<Long> keys = countMatches(countChars(line));
    int matchNum = keys.size();
    if (matchNum < 2) return true;
    return matchNum == 2 && Math.abs(keys.get(0) - keys.get(1)) == 1;
}

After counting the occurrences of each unique symbol, I'd like to get a List<Long>, but I don't know how to do that directly:

private Map<Character, Long> countChars(String line) { 
    return line.chars()
               .mapToObj(c -> (char) c)
               .collect(groupingBy(Function.identity(), HashMap::new, counting()));
}


private List<Long> countMatches(Map<Character, Long> countedEntries) {
    return new ArrayList<>(countedEntries.values()
            .stream()
            .collect(groupingBy(Function.identity(), HashMap::new, counting()))
            .keySet());
}

How can I optimize the methods above? I only need a List<Long>, but I have to create a map first.

Upvotes: 4

Views: 953

Answers (4)

pero_hero

Reputation: 3184

You can check whether every char in a string has the same occurrence count using the Stream API like this:

boolean valid = "aabbccded".chars()
      .boxed()
      .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
      .values().stream()
      .reduce((a, b) -> a.equals(b) ? a : -1L)   // equals, not ==: the counts are boxed Longs
      .map(v -> v > 0)
      .get();                                    // throws NoSuchElementException for an empty string

EDIT:

After reading the comments, I believe I now understand the requirement:

  1. A string is considered valid if all chars in it have the same occurrence count, like aabb,
  2. or if there is a single extra character, like abb.
  3. The string abcabcab is invalid: it has 3 a, 3 b and 2 c, so it has 1 extra a and 1 extra b, which is too much.

Hence you can't perform the validation with a list of distinct frequencies alone. You also need to know how many chars share each frequency -> Map

Here is a new attempt:

TreeMap<Long, Long> map = "abcabcab".chars()
                .boxed()
                .collect(groupingBy(Function.identity(), counting()))
                .values().stream()
                .collect(groupingBy(Function.identity(), TreeMap::new, counting()));

boolean valid = map.size() == 1 ||        // there is only a single char length
        ( map.size() == 2 &&              // there are two and there is only 1 extra char
        ((map.lastKey() - map.firstKey()) * map.lastEntry().getValue() <= 1));
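
To make the map of frequencies easier to follow, here is a minimal sketch that wraps the same check in a method and prints the intermediate map for the strings from the question. The class name FrequencyOfFrequencies, the helper names countOfCounts and isValid, and the choice to treat an empty string as valid are assumptions made for illustration only.

```java
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

class FrequencyOfFrequencies {

    // Maps each occurrence count to the number of distinct chars that have that count.
    static TreeMap<Long, Long> countOfCounts(String line) {
        return line.chars()
                .boxed()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .values().stream()
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    // Same condition as above. Assumption: an empty string (empty map) counts as valid.
    static boolean isValid(String line) {
        TreeMap<Long, Long> map = countOfCounts(line);
        if (map.size() <= 1) {
            return true;
        }
        return map.size() == 2
                && (map.lastKey() - map.firstKey()) * map.lastEntry().getValue() <= 1;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"aabb", "aab", "ababa", "abcabcab"}) {
            System.out.println(s + " -> " + countOfCounts(s) + " -> " + isValid(s));
        }
        // aabb     -> {2=2}      -> true
        // aab      -> {1=1, 2=1} -> true
        // ababa    -> {2=1, 3=1} -> true
        // abcabcab -> {2=1, 3=2} -> false
    }
}
```

For abcabcab the higher frequency 3 is shared by two chars, so the product (3 - 2) * 2 is 2 and the string is rejected.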

The whole validation can be executed in a single statement by using the Collectors.collectingAndThen method that @Nikolas used in his answer, or you can use a reduction as well:

boolean valid = "aabcc".chars()
    .boxed()
    .collect(groupingBy(Function.identity(), counting()))
    .values().stream()
    .collect(groupingBy(Function.identity(), TreeMap::new, counting()))
    .entrySet().stream()
    // min.getKey()    is the min char length
    // high.getKey()   is a higher char length
    // high.getValue() is the occurrence count of the higher char length
    .reduce((min, high) -> {
         min.setValue((min.getKey() - high.getKey()) * high.getValue()); // this is always negative
         return min;
        })
    .map(min -> min.getValue() >= -1)
    .get();

Upvotes: 2

Hadi

Reputation: 17289

You can do it like this:

  1. First, count the occurrences of every character.
  2. Then find the minimum occurrence count.
  3. Finally, sum the differences between each count and that minimum (minValue); the string is valid if the sum is less than or equal to one.

    public static boolean validate(String line) {
        Map<Character, Long> map = line.chars()
                     .mapToObj(c -> (char) c)
                     .collect(groupingBy(Function.identity(), Collectors.counting()));
        long minValue = map.values().stream().min(Long::compareTo).orElse(0L);
        return map.values().stream().mapToLong(a -> Math.abs(a - minValue)).sum() <= 1;
    }
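
A small worked example may help to see why the sum of differences works. The sketch below recomputes the same sum as a separate value so it can be printed. The class name SurplusOverMinimum and the method name surplus are made up for this illustration and are not part of the answer's code.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class SurplusOverMinimum {

    // Total number of occurrences above the smallest per-character count.
    static long surplus(String line) {
        Map<Character, Long> counts = line.chars()
                .mapToObj(c -> (char) c)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        // An empty string has no counts; 0 is used as the minimum in that case.
        long min = counts.values().stream().mapToLong(Long::longValue).min().orElse(0L);
        return counts.values().stream().mapToLong(count -> count - min).sum();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"abc", "aab", "abcabcab", "aaab"}) {
            long extra = surplus(s);
            System.out.println(s + " -> surplus " + extra + " -> valid: " + (extra <= 1));
        }
        // abc      -> surplus 0 -> valid: true
        // aab      -> surplus 1 -> valid: true
        // abcabcab -> surplus 2 -> valid: false
        // aaab     -> surplus 2 -> valid: false
    }
}
```

Note that aaab is rejected as well: the two distinct counts 3 and 1 differ by two, so a is two occurrences over the minimum.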
    

Upvotes: 1

Nikolas

Reputation: 44398

Use Collectors.collectingAndThen, a collector that wraps a downstream Collector and applies a finisher Function to its result.

  • Use the Collectors.groupingBy and Collectors.counting to get the frequency of each character in the String.

    // Results in Map<Integer, Long>
    .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
    
  • Use the finisher map -> new HashSet<>(map.values()).size() == 1, which checks whether all frequencies are equal: if so, there is exactly one distinct value.

Wrapping these two in Collectors.collectingAndThen looks like:

String line = "aabbccdeed";
boolean isValid = line.chars()                          // IntStream of characters
    .boxed()                                            // boxed as Stream<Integer>
    .collect(Collectors.collectingAndThen(              // collects, then applies the finisher
        Collectors.groupingBy(                          // grouped into Map<Integer, Long>
                Function.identity(),                    // ... keyed by each character
                Collectors.counting()),                 // ... with its frequency as value
        map -> new HashSet<>(map.values()).size() == 1  // checks that all frequencies are equal
    ));

// aabbccded  -> false
// aabbccdeed -> true
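
The finisher above only accepts strings in which all frequencies are equal. As a hedged sketch, the same collectingAndThen structure could also allow one extra character by swapping in a different finisher. The class SingleCollectorValidation and the finisher atMostOneExtra are hypothetical names, and the rule it applies (total surplus over the minimum frequency is at most one) is an assumption about the requirement, borrowed from the sum-of-differences idea in another answer.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class SingleCollectorValidation {

    static boolean isValid(String line) {
        return line.chars()
                .boxed()
                .collect(Collectors.collectingAndThen(
                        Collectors.groupingBy(Function.identity(), Collectors.counting()),
                        SingleCollectorValidation::atMostOneExtra));
    }

    // Hypothetical finisher: valid if the counts exceed the minimum by at most one in total.
    // Assumption: an empty string (no frequencies) is treated as valid.
    static boolean atMostOneExtra(Map<Integer, Long> frequencies) {
        if (frequencies.isEmpty()) {
            return true;
        }
        long min = Collections.min(frequencies.values());
        long extra = frequencies.values().stream().mapToLong(f -> f - min).sum();
        return extra <= 1;
    }

    public static void main(String[] args) {
        System.out.println(isValid("aabbccdeed")); // true: all frequencies are equal
        System.out.println(isValid("ababa"));      // true: one extra a
        System.out.println(isValid("abcabcab"));   // false: one extra a and one extra b
    }
}
```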

Upvotes: 1

Naman

Reputation: 31878

As far as I can see, you are looking for the distinct frequencies using those two methods. You can merge them into one method that uses a single stream pipeline, as below:

private List<Long> distinctFrequencies(String line) {
    return line.chars().mapToObj(c -> (char) c)
            .collect(Collectors.groupingBy(Function.identity(),
                    Collectors.counting()))
            .values().stream()
            .distinct()
            .collect(Collectors.toList());
}

Of course, all you need to change in your validate method now is the assignment:

List<Long> keys = distinctFrequencies(line);

On further thought, if you wish to re-use the Map<Character, Long> countChars API somewhere else as well, you could change the distinct frequencies API to build on it:

private List<Long> distinctFrequencies(String line) {
    return countChars(line).values()
            .stream()
            .distinct()
            .collect(Collectors.toList());
}
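
One caveat, sketched below under the assumption that validate stays as written in the question: a list of distinct frequencies is the same for ababa and abcabcab (the values 2 and 3, in whatever order the map yields them), so this refactoring alone does not reject the abcabcab case from the question's update. The class name DistinctFrequenciesDemo is made up for the demonstration.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class DistinctFrequenciesDemo {

    // Single-pipeline version from this answer.
    static List<Long> distinctFrequencies(String line) {
        return line.chars().mapToObj(c -> (char) c)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .values().stream()
                .distinct()
                .collect(Collectors.toList());
    }

    // validate as posted in the question, using the merged method.
    static boolean validate(String line) {
        List<Long> keys = distinctFrequencies(line);
        int matchNum = keys.size();
        if (matchNum < 2) return true;
        return matchNum == 2 && Math.abs(keys.get(0) - keys.get(1)) == 1;
    }

    public static void main(String[] args) {
        // Both strings have exactly two distinct frequencies that differ by one,
        // so both are accepted, although abcabcab should be invalid.
        System.out.println(validate("ababa"));    // true
        System.out.println(validate("abcabcab")); // true
    }
}
```

Rejecting abcabcab requires knowing how many chars share each frequency, which the distinct list discards.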

Upvotes: 3
