Zootopia
Zootopia

Reputation: 121

Print the most repeated word in a string

Write a java program to find the most repeated word in a string and also print its frequency.


INPUT

are you are

OUTPUT

are: 2


This question can be done by using HashMap or file reader (I suppose) but actually, I haven't learned them yet.

Yet, I managed to write a code that displays the frequency (but not the word)

import java.util.Scanner;
class duplicatewords
   {
       void main()
       {
           Scanner sc=new Scanner(System.in);
           System.out.println("Enter the string");
           String str=sc.nextLine();
           String arr[]=str.split(" ");
           int count=1; int checkvalue=0;
           for(int i=0;i<arr.length-1;i++)
           {
               String temp=arr[i];
               for(int j=i+1;j<arr.length;j++)
               {
                   String anothertemp=arr[j];
                   if(temp.equalsIgnoreCase(anothertemp))
                   count++;
                }
                if(checkvalue<c)
                checkvalue=c;
                c=1;
            }
            System.out.println(checkvalue);
        }
    } 

I want to know how to print the word also without using any map or reader.

I think the program will be a very complicated one but I will understand.

Any help will be greatly appreciated.

Upvotes: 1

Views: 1824

Answers (4)

WJS
WJS

Reputation: 40047

Here is one way. It just maintains a list of words to count and adjust the max while iterating thru the list.

  • scratch arrays are allocated to the max number of words.
  • cnt is adjust to control iterating thru updated arrays based on discovered words.
String s =
        "this when this how to now this why apple when when other now this now";
String[] words = s.split("\\s+");
int cnt = 0;
int idx = -1;
String[] list = new String[words.length];
int[] count = new int[words.length];
int max = 0;
for (int i = 0; i < words.length; i++) {
    for (int k = 0; k < cnt; k++) {
        if (list[k].equals(words[i])) {
            count[k]++;
            if (count[k] > max) {
                max = count[k];
                idx = k;
            }
            break;
        }
    }
    count[cnt] = 1;
    list[cnt++] = words[i];
}

System.out.println(words[idx] + " " + max);

prints

this 4

And here is yet another solution using streams. This simply creates a map of word counts then finds the first entry that has the greatest count. Ties are ignored.

Entry<String, Integer> result = Arrays.stream(s.split("\\s+"))
        .collect(Collectors.toMap(r -> r, q -> 1,
                (a, b) -> a + 1))
        .entrySet().stream().max(Entry.comparingByValue())
        .get();

System.out.println(result);

prints

this=4

Upvotes: 2

hfontanez
hfontanez

Reputation: 6188

There are two solutions, but one is better than the other.

Solution one: use Map<String, Integer>

public class WordCount {
    public static void main(String[] args) {
        String phrase = "This example is very good but is not very efficient";
        String[] words = phrase.split(" ");
        List<String> wordList = Arrays.asList(words);
        
        Map<String, Integer> wordCountMap = wordList.parallelStream().
                collect(Collectors.toConcurrentMap(
                        w -> w, w -> 1, Integer::sum));
            
        System.out.println(wordCountMap);
    }
}

This produces the following output:

{but=1, very=2, not=1, efficient=1, This=1, is=2, good=1, example=1}

As you can see, very and is are tied for the most frequent word. This is why this solution is not the most efficient. What if you could reverse the map in order to group words with similar frequency together?

Solution two: use Map<Integer, List<String>>

In my opinion, this is a better solution. This solution will group all words with similar count. Using the same input of the solution above, words with similar frequency will be bundled together. So, when querying the map for the highest count, both very and is will be returned as expected. Using Lambda expressions makes this task of "inverting" the map very easy:

Map<Integer, List<String>> mapInverted = 
        wordCountMap.entrySet()
           .stream()
           .collect(Collectors.groupingBy(Map.Entry::getValue, Collectors.mapping(Map.Entry::getKey, Collectors.toList())));
        
System.out.println(mapInverted);

After adding these lines to the sample code from solution one, I now have an aggregation of words of similar word counts:

{1=[but, not, efficient, This, good, example], 2=[very, is]}

For both of these approaches, the way to get the max value:


Entry<String, Integer> maxEntry = Collections.max(wordCountMap.entrySet(),
                (Entry<String, Integer> e1, Entry<String, Integer> e2) -> e1.getValue().compareTo(e2.getValue())); // for solution one
System.out.println(maxEntry); // outputs: very=2

Entry<Integer, List<String>> maxKey = Collections.max(mapInverted.entrySet(),
                (Entry<Integer, List<String>> e1, Entry<Integer, List<String>> e2) -> e1.getKey().compareTo(e2.getKey())); // for solution two
System.out.println(maxKey); // outputs: 2=[very, is]

Upvotes: 1

Nowhere Man
Nowhere Man

Reputation: 19575

In fact, to get the most frequent word, existing code needs to be modified slightly just to provide a variable for the most repeated variable which has to be updated when a more frequent word is detected. No extra arrays/data structures is needed for this specific task.

String arr[] = str.split(" ");
int maxFreq = 0;
String mostRepeated = null;

for (int i = 0; i < arr.length; i++) {
    String temp = arr[i];
    int count = 1;
    for (int j = i + 1; j < arr.length; j++) {
        if (temp.equalsIgnoreCase(arr[j]))
            count++;
    }
    if (maxFreq < count) {
       maxFreq = count;
       mostRepeated = temp;
    }
}
System.out.println(mostRepeated + ": " + maxFreq);

For the input:

String str = "I am he as you are he as you are me and we are all together";

Output:

are: 3

A bit faster implementation could include setting the duplicated values to null to skip them later:

for (int i = 0; i < arr.length; i++) {
    if (null == arr[i]) continue;
    String temp = arr[i];
    int count = 1;
    for (int j = i + 1; j < arr.length; j++) {
        if (temp.equalsIgnoreCase(arr[j])) {
            count++;
            arr[j] = null;
        }
    }
    if (maxFreq < count) {
       maxFreq = count;
       mostRepeated = temp;
    }
}

Upvotes: 2

f1sh
f1sh

Reputation: 11941

Here is my solution using 2 arrays:

public static void main(String[] args) {
    String input = "are you are";
    String[] words = input.split(" ");

    //the 10 is the limit of individual words:
    String[] wordsBucket = new String[10];
    Integer[] countBucket = new Integer[10];

    for(String word:words){
        int index = findIndex(word, wordsBucket);
        incrementIndex(countBucket, index);
    }
    int highest = findMax(countBucket);

    System.out.println(wordsBucket[highest]+": "+countBucket[highest]);
}

private static int findMax(Integer[] countBucket) {
    int max = 0;
    int maxIndex = 0;
    for(int i=0;i<countBucket.length;i++) {
        if(countBucket[i]==null) {
            break;
        }
        if(countBucket[i] > max) {
            max = countBucket[i];
            maxIndex = i;
        }
    }
    return maxIndex;
}

private static int findIndex(String word, String[] wordsBucket) {
    for(int i=0;i<wordsBucket.length;i++) {
        if(word.equals(wordsBucket[i])) {
            return i;
        }
        if(wordsBucket[i] == null) {
            wordsBucket[i] = word;
            return i;
        }
    }
    return -1;
}

private static void incrementIndex(Integer[] countBucket, int index) {
    if(countBucket[index] == null){
        countBucket[index] = 1;
    } else {
        countBucket[index]++;
    }
}

This prints are: 2. As @knittl pointed out in the comments this can also be done with 1 array of Pair<String, Integer> or something similar.

If you are allowed to use Maps and streams, then this works too (using the same String[] words as input as above):

Map<String, Integer> countingMap = new HashMap<>();
Arrays.stream(words).forEach(w->countingMap.compute(w, (ww,c)->c==null?1:c+1));
Map.Entry<String, Integer> h = countingMap.entrySet().stream().sorted(Comparator.comparingInt(Map.Entry<String,Integer>::getValue).reversed()).findFirst().get();
System.out.println(h);

This prints are=2

Upvotes: 2

Related Questions