Aashu

Reputation: 31

Updating an ArrayList in a HashMap

I need to create an index of the words in several documents. The index has the following format:

word ,{d1,f1,d2,f2...},value

word = each word that appears in the documents

d1,d2,... = names of the documents in which the word appears

f1,f2,... = number of times the word appears in that document

value = some calculation based on number of files in which the word appears

I have created two classes so far: IRSystems and ReferenceCount.

ReferenceCount holds a documentId (d1, d2, ...) and a count (f1, f2, ...).

IRSystems has an ArrayList of ReferenceCount and a HashMap<String, ArrayList<ReferenceCount>>. I read all the words from one document at a time, and each word is held in a variable named "tokens". I am trying to add the words to the HashMap so that, if a word already exists in the HashMap, I look for the document that word belongs to. If it is from the same document, I update the count. If it is from a different document, I add a new documentId and a new count to the ArrayList.

This is what I have so far. I have two problems: it does not increase the count of a word when the word appears again in the same document, and I have not been able to implement "value".

HashMap<String, ArrayList<ReferenceCount>> normalList = new HashMap<String, ArrayList<ReferenceCount>>();

// "name" is the id of the document being read; "st" tokenizes its text.
while (st.hasMoreElements())
{
    String tokens = st.nextToken();
    if (!normalList.containsKey(tokens))
    {
        // First occurrence of this word: start its list with the current document.
        rList = new ArrayList<ReferenceCount>();
        rCount = new ReferenceCount(name);
        rList.add(rCount);
        normalList.put(tokens, rList);
    }
    else
    {
        System.out.println("Match found");
        // Word already indexed: this loop visits every entry in the map,
        // not only the list stored for the current word.
        for (Map.Entry<String, ArrayList<ReferenceCount>> pair : normalList.entrySet())
        {
            for (ReferenceCount rC : pair.getValue())
            {
                // Compares the entry's key (a word) with a document id.
                if (pair.getKey().equals(rC.getDocumentId()))
                {
                    System.out.println("Match found 2 ");
                    rC.increment();
                }
            }
        }
    }
}
// to display the hashmap
for (Map.Entry<String, ArrayList<ReferenceCount>> pair : normalList.entrySet())
{
    System.out.println(pair.getKey() + ",");
    for (ReferenceCount rC : pair.getValue())
    {
        // Builds a fresh ReferenceCount for the current document and prints that;
        // the stored rC is not read here.
        rCount = new ReferenceCount(name);
        System.out.println(rCount.getDocumentId() + "," + rCount.getCount());
    }
}

Upvotes: 3

Views: 402

Answers (1)

izce

Reputation: 189

You used a map for the words. Why not use the same for the documentIds? You can create a HashMap of HashMaps like this:

HashMap<String, HashMap<String, Integer>> wordCountMap = 
       new HashMap<String, HashMap<String, Integer>>();

And for your values, you can create a separate HashMap with the word as key and the calculated value as the value:

HashMap<String, String> wordValueMap = new HashMap<String, String>(); 

For each word, check wordCountMap.containsKey(newWord). If the key does not exist, create the inner HashMap with the new documentId and a word count of 1. If the key exists, get the existing inner HashMap and check whether it already contains the documentId: if it does, increment that count; if it does not, add the documentId with a count of 1.
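The update step described above could be sketched roughly as follows. This is only an illustration: the class name WordIndexSketch, the method addDocument, and the whitespace tokenization are assumptions, not part of the original code. It uses Map.computeIfAbsent and Map.merge (Java 8 or later) in place of the explicit containsKey checks.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical sketch: word -> (documentId -> count).
class WordIndexSketch {
    final Map<String, Map<String, Integer>> wordCountMap = new HashMap<>();

    // Counts every whitespace-separated token of one document
    // (splitting on whitespace is an assumption about the input).
    void addDocument(String documentId, String text) {
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            String word = st.nextToken();
            // Fetch only this word's inner map, creating it on first sight.
            Map<String, Integer> perDocument =
                    wordCountMap.computeIfAbsent(word, k -> new HashMap<>());
            // New document: store 1. Known document: add 1 to its count.
            perDocument.merge(documentId, 1, Integer::sum);
        }
    }

    public static void main(String[] args) {
        WordIndexSketch index = new WordIndexSketch();
        index.addDocument("d1", "apple banana apple");
        index.addDocument("d2", "apple cherry");
        // Prints the per-document counts recorded for "apple".
        System.out.println(index.wordCountMap.get("apple"));
    }
}
```

The key difference from iterating over the whole map is that only the inner map for the current word is touched, so a repeated word in the same document simply bumps one counter.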

Finally, you can maintain the calculated value separately in wordValueMap.
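As a rough sketch of filling wordValueMap: the question does not say what the calculation is, only that it depends on the number of files containing the word, so the formula below (the natural log of total documents divided by documents containing the word) is purely an assumed stand-in. The class name WordValueSketch and the method computeValues are hypothetical, and the values are stored as Double rather than String to keep the arithmetic simple.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: derives one value per word from the nested count map.
class WordValueSketch {
    // Assumed formula (not given in the question):
    // log(totalDocuments / numberOfDocumentsContainingWord).
    static Map<String, Double> computeValues(
            Map<String, Map<String, Integer>> wordCountMap, int totalDocuments) {
        Map<String, Double> wordValueMap = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> entry : wordCountMap.entrySet()) {
            // The inner map has one key per document, so its size is the file count.
            int documentsContainingWord = entry.getValue().size();
            wordValueMap.put(entry.getKey(),
                    Math.log((double) totalDocuments / documentsContainingWord));
        }
        return wordValueMap;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> wordCountMap = new HashMap<>();
        Map<String, Integer> apple = new HashMap<>();
        apple.put("d1", 2);
        apple.put("d2", 1);
        wordCountMap.put("apple", apple);
        Map<String, Integer> cherry = new HashMap<>();
        cherry.put("d2", 1);
        wordCountMap.put("cherry", cherry);
        System.out.println(computeValues(wordCountMap, 2));
    }
}
```

Because the inner map is keyed by documentId, the number of files containing a word is just the size of that map, so any other formula based on that count can be swapped in at the same place.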

Upvotes: 1
