nugh

Reputation: 135

How do I count duplicate strings in an Array?

I have looked through Stack Overflow, but none of the examples I have tried work in my case.

I want to count how many times a string occurs in an array. The array is built by splitting up an input String, such as "Henry and Harry went out", into overlapping substrings (n-grams) of a given length (2 in the following example) and counting each distinct substring. Please forgive me if my style is bad; it's my first project.

He = 1

en = 2

nr = 1

ry = 2

a = 1

an = 1

etc. Here is my code for the constructor:

   public NgramAnalyser(int n, String inp) 
   { 
       boolean processed = false;
       ngram = new HashMap<>(); // used to store the ngram strings and count
       alphabetSize = 0;
       ngramSize = n;
       ArrayList<String> tempList = new ArrayList<String>();
       System.out.println("inp length: " + inp.length());
       System.out.println();
       int finalIndex = 0;

       for(int i=0; i<inp.length()-(ngramSize - 1); i++)
       {
           tempList.add(inp.substring(i,i+ngramSize));
           alphabetSize++;
           // if i (the index) has reached the boundary limit (before it gets an error), then...
           if(i == (inp.length()- ngramSize))
           {
               processed = true;
               finalIndex = i;
               break;
           }
       }

       if(processed == true)
       { 
          for(int i=1; i<(ngramSize); i++)
          {
             String startString = inp.substring(finalIndex+i,inp.length());
             String endString = inp.substring(0, i);
             tempList.add(startString + endString);
          }  
       }

       for(String item: tempList)
       {
           System.out.println(item);
       }

   }
   // code for counting the ngrams and sorting them

Upvotes: 1

Views: 221

Answers (3)

HummingBird

Reputation: 1

This code takes the string, converts it to a single case, removes the spaces, and turns it into a character array. It then inserts the characters into a map one by one: if a character already exists as a key, its count is incremented by one; otherwise it is put in with a count of one. Good luck.

        import java.util.Arrays;
        import java.util.HashMap;
        import java.util.Map;

        // take an example string, remove whitespace, convert it to one case
        // (lower or upper), then turn it into a character array
        char[] charArray = "This is an example text".replaceAll("\\s","").toLowerCase().toCharArray();
        System.out.println(Arrays.toString(charArray));
        Map<Character, Integer> charCount = new HashMap<>();
        for (char c : charArray){
            // if the key doesn't exist, put it with a count of 1
            if(!charCount.containsKey(c)){
                charCount.put(c, 1);
            }else{
                // if the key exists, increment its value by 1
                charCount.put(c, charCount.get(c) + 1);
            }
        }

        System.out.println(charCount.toString());

output:

[t, h, i, s, i, s, a, n, e, x, a, m, p, l, e, t, e, x, t]
{p=1, a=2, s=2, t=3, e=3, h=1, x=2, i=2, l=1, m=1, n=1}

Upvotes: 0

Ted Cassirer

Reputation: 364

This method creates a HashMap whose keys are the distinct items and whose values are the item counts. I think the code is pretty easy to understand, but ask if there's something that isn't clear or might be wrong.

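public Map<String, Integer> ngram(String inp, int n)
{
    Map<String, Integer> nGram = new HashMap<>();
    // the last valid start index is inp.length() - n, so the final n-gram is included
    for(int i = 0; i <= inp.length() - n; i++)
    {
        String item = inp.substring(i, i+n);
        // look up the current count, defaulting to 0 for an unseen item
        int itemCount = nGram.getOrDefault(item, 0);
        nGram.put(item, itemCount+1);
    }
    return nGram;
}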
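
The constructor in the question also adds n-grams that wrap around from the end of the input back to its start. If that behaviour is wanted, one possible sketch is shown here. The class name `WrapAroundNgrams` and the method `countWrapped` are made up for illustration and are not part of the question's code:

```java
import java.util.HashMap;
import java.util.Map;

class WrapAroundNgrams {
    // Hypothetical helper: counts every n-gram of inp, including the n-1
    // windows that start near the end and continue at the beginning.
    static Map<String, Integer> countWrapped(String inp, int n) {
        Map<String, Integer> counts = new HashMap<>();
        if (n <= 0 || n > inp.length()) {
            return counts; // nothing sensible to count
        }
        // Appending the first n-1 characters lets plain substring() wrap around.
        String extended = inp + inp.substring(0, n - 1);
        for (int i = 0; i < inp.length(); i++) {
            // merge() inserts 1 for a new key, otherwise adds 1 to the old count
            counts.merge(extended.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // windows of "abab" with n = 2: ab, ba, ab, and the wrapped ba
        System.out.println(countWrapped("abab", 2)); // prints {ab=2, ba=2}
    }
}
```

With wrap-around there is one n-gram per input character, which appears to match what the question's loop over `tempList` produces.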

Upvotes: 0

freedev

Reputation: 30197

A simple solution is to use the Map<String, Integer> ngram: while iterating over your list of n-grams, for each String found in your input, update its counter (the Integer value), inserting it with a count of 1 if it is not yet a key.
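
As a rough sketch of that idea, assuming the n-grams have already been collected into a list like the question's `tempList`: the class `NgramCounter` and its methods are hypothetical names, and the sort order (highest count first, ties broken alphabetically) is only one guess at what "sorting them" should mean.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NgramCounter {
    // Counts how often each string occurs in the list.
    static Map<String, Integer> count(List<String> ngrams) {
        Map<String, Integer> counts = new HashMap<>();
        for (String gram : ngrams) {
            counts.merge(gram, 1, Integer::sum); // 1 if new, otherwise old count + 1
        }
        return counts;
    }

    // Returns the entries ordered by count (descending), then by key.
    static List<Map.Entry<String, Integer>> sortedByCount(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        List<String> grams = Arrays.asList("en", "ry", "en", "He");
        System.out.println(sortedByCount(count(grams))); // prints [en=2, He=1, ry=1]
    }
}
```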

Upvotes: 2
