Reputation: 35
I have this function:
public void insert(String token, int docID)
{
    insertNormIdx(token, docID);
}
which is called repeatedly by the main program. docID is a document's ID and token is one word found in that document.
The function keeps being called until all documents have been parsed. What I want is a HashMap keyed by docID, where each value is another HashMap holding the words (tokens) found in that document together with their counts.
That is, if we found the word (token) 'the' 10 times in document (docID) '5', I would like a structure that holds this information like: 5,the,10.
This is what I have done, but it doesn't really work; it only keeps the first word from each document:
HashMap<Integer, HashMap<String, Integer>> normal_idx = new HashMap<Integer, HashMap<String, Integer>>();
public void insertNormIdx(String token, int docID)
{
    HashMap<String, Integer> val = new HashMap<String, Integer>();
    if (!normal_idx.containsKey(docID))
    {
        val.put(token, 1);
        normal_idx.put(docID, val);
    }
    if (normal_idx.containsKey(docID))
    {
        if (normal_idx.get(docID).get(token) != null)
        {
            val.put(token, normal_idx.get(docID).get(token) + 1);
            normal_idx.put(docID, val);
        }
    }
}
Upvotes: 2
Views: 789
Reputation: 30839
You can use Java 8's computeIfAbsent method, together with merge, to put or update values in the map, e.g.:
public void insertNormIdx(String token, int docID) {
    normal_idx.computeIfAbsent(docID, k -> new HashMap<>()).merge(token, 1, (old, one) -> old + one);
}
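For anyone who wants to try this end to end, here is a minimal self-contained sketch. The class name NormalIndex and the count helper are hypothetical, added only so the example can be run and checked; Integer::sum behaves the same as (old, one) -> old + one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper class, not from the question; it only hosts the map.
class NormalIndex {
    // docID -> (token -> number of occurrences in that document)
    private final Map<Integer, Map<String, Integer>> normalIdx = new HashMap<>();

    void insertNormIdx(String token, int docID) {
        // Create the inner map on first sight of docID, then add 1 to the token's count.
        normalIdx.computeIfAbsent(docID, k -> new HashMap<>())
                 .merge(token, 1, Integer::sum);
    }

    // Hypothetical lookup helper: returns 0 for an unknown document or token.
    int count(int docID, String token) {
        Map<String, Integer> doc = normalIdx.get(docID);
        return doc == null ? 0 : doc.getOrDefault(token, 0);
    }

    public static void main(String[] args) {
        NormalIndex idx = new NormalIndex();
        for (int i = 0; i < 10; i++) {
            idx.insertNormIdx("the", 5);
        }
        idx.insertNormIdx("cat", 5);
        System.out.println("5,the," + idx.count(5, "the")); // prints 5,the,10
    }
}
```

Both computeIfAbsent and merge require Java 8 or later.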
Upvotes: 1
Reputation: 2006
Better way to do it:
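public void insertNormIdx(String token, int docID) {
    // Look up the per-document map, creating it on first use.
    Map<String, Integer> doc = normal_idx.get(docID);
    if (doc == null) {
        normal_idx.put(docID, doc = new HashMap<String, Integer>());
    }
    // Start the token at 1, or increment its existing count.
    Integer counter = doc.get(token);
    if (counter == null)
        doc.put(token, 1);
    else
        doc.put(token, counter + 1);
}
Note that this assigns a Map<String, Integer> to the map's values, so normal_idx would need to be declared as Map<Integer, Map<String, Integer>> (or doc declared as HashMap<String, Integer>) for it to compile.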
And by the way, don't use just a bare HashMap for this; create a Document class.
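The answer doesn't spell out what such a class would look like, so the following is only one possible shape. The names Document, DocumentIndex, addToken and countOf are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Document class: owns the token counts for one document.
class Document {
    private final int id;
    private final Map<String, Integer> counts = new HashMap<String, Integer>();

    Document(int id) {
        this.id = id;
    }

    int getId() {
        return id;
    }

    // Record one more occurrence of the token in this document.
    void addToken(String token) {
        Integer counter = counts.get(token);
        counts.put(token, counter == null ? 1 : counter + 1);
    }

    // Number of times the token has been added; 0 if never seen.
    int countOf(String token) {
        Integer counter = counts.get(token);
        return counter == null ? 0 : counter;
    }
}

// Hypothetical index: maps docID to its Document instead of to a nested HashMap.
class DocumentIndex {
    private final Map<Integer, Document> docs = new HashMap<Integer, Document>();

    void insert(String token, int docID) {
        Document doc = docs.get(docID);
        if (doc == null) {
            docs.put(docID, doc = new Document(docID));
        }
        doc.addToken(token);
    }

    // Returns null if no token has been inserted for this docID.
    Document get(int docID) {
        return docs.get(docID);
    }
}
```

With this layout the counting logic lives in one place, and callers work with Document rather than with a map of maps.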
Upvotes: 1
Reputation: 50734
There's a lot of redundancy and several mistakes in your code. The specific problem in your question is that there's no else for this if:
if (normal_idx.get(docID).get(token)!=null)
Therefore new tokens are never inserted.
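To see this in action, here is a small reproduction. The method body follows the logic from the question; the wrapper class OriginalBugDemo and the main method are hypothetical scaffolding added only to run it.

```java
import java.util.HashMap;

// Hypothetical demo class wrapping the method from the question.
class OriginalBugDemo {
    static HashMap<Integer, HashMap<String, Integer>> normal_idx =
            new HashMap<Integer, HashMap<String, Integer>>();

    // Same logic as in the question.
    static void insertNormIdx(String token, int docID) {
        HashMap<String, Integer> val = new HashMap<String, Integer>();
        if (!normal_idx.containsKey(docID)) {
            val.put(token, 1);
            normal_idx.put(docID, val);
        }
        if (normal_idx.containsKey(docID)) {
            if (normal_idx.get(docID).get(token) != null) {
                val.put(token, normal_idx.get(docID).get(token) + 1);
                normal_idx.put(docID, val);
            }
            // No else branch here: a token that is not yet in the
            // document's map is silently dropped.
        }
    }

    public static void main(String[] args) {
        insertNormIdx("the", 5);
        insertNormIdx("cat", 5);
        System.out.println(normal_idx); // prints {5={the=2}}
    }
}
```

The output also exposes a second problem: the second if runs immediately after the first one has inserted the token, so the first word of each document is counted twice on its first insertion.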
But the whole code can be significantly improved. In Java 8, you can replace the entire method with:
normal_idx.computeIfAbsent(docID, k -> new HashMap<>())
.merge(token, 1, Integer::sum);
If you're on an earlier Java version, you can try this:
HashMap<String, Integer> val = normal_idx.get(docID);
if (val == null) {
    val = new HashMap<String, Integer>();
    normal_idx.put(docID, val);
}
Integer count = val.get(token);
if (count == null) {
    val.put(token, 1);
} else {
    val.put(token, count + 1);
}
Upvotes: 2