Yohan

Reputation: 45

Displaying nested HashMaps with a TreeSet to show the index of a word in a given file

I want to display (in JSON format) a list of the words contained in a file. The first key is the word, the second key is the file it came from, and the values are the positions at which the word occurs in that file. The problem is that all my words end up with the same HashMap<String, TreeSet<Integer>>: they all share one nested HashMap, but I want each word to have its own independent one. I suspect the new TreeSet is involved. Would love a little help. Here is my code:

public static HashMap<String, HashMap<String, TreeSet<Integer>>> listStems(Path inputFile) throws 
IOException {
    HashMap<String, HashMap<String, TreeSet<Integer>>> finalString = new HashMap<String, 
    HashMap<String, TreeSet<Integer>>>();
    HashMap<String, TreeSet<Integer>> mapString = new HashMap<String, TreeSet<Integer>>();
    int counter=0;
    Stemmer stemmer = new SnowballStemmer(DEFAULT);
    try (BufferedReader br =
            new BufferedReader(new InputStreamReader(
                    new FileInputStream(inputFile.toString()), "UTF-8"));) {    
                String line;
                while((line = br.readLine()) != null) {
                    String[] toStemArray = parse(line);
                    
                    for(int i = 0;i<toStemArray.length;i++) {
                        counter++;
                        if(!finalString.containsKey(toStemArray[i])) {
                            mapString.put(inputFile.toString(), new TreeSet<Integer>());
                            finalString.put(toStemArray[i], mapString);
                            finalString.get(toStemArray[i]).get(inputFile.toString()).add(counter);
                        }
                        else if(finalString.containsKey(toStemArray[i])) {
                            finalString.get(toStemArray[i]).get(inputFile.toString()).add(counter);
                        }
                    }
                }
    }       
    return finalString;
}

Upvotes: 0

Views: 57

Answers (1)

gabeg

Reputation: 139

All of your HashMap<String, TreeSet> instances are the same because you only create a single instance (mapString) at the start of your method, then re-use it.

In your inner if statement, you check to see if you've seen the word before, and if you haven't you add an entry to your one single HashMap<String, TreeSet> that maps the file name to a new, empty TreeSet<Integer>. That's almost the right pattern--you're detecting a new word and creating a new TreeSet, but not creating a new HashMap<String, TreeSet>.
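To make the sharing concrete, here is a minimal sketch that reproduces it outside your method. The class name SharedMapDemo, the words "cat" and "dog", and the file name "a.txt" are made up for illustration; only the put/get pattern mirrors your code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

class SharedMapDemo {
    /** Follows the question's pattern and reports whether two words share one inner map. */
    static boolean sharesInnerMap() {
        Map<String, Map<String, TreeSet<Integer>>> outer = new HashMap<>();
        // Created once, like mapString in the question.
        Map<String, TreeSet<Integer>> inner = new HashMap<>();

        // First new word: store the one inner map under "cat".
        inner.put("a.txt", new TreeSet<>());
        outer.put("cat", inner);
        outer.get("cat").get("a.txt").add(1);

        // Second new word: the same inner map object is stored again,
        // and the put replaces the TreeSet that "cat" was using too.
        inner.put("a.txt", new TreeSet<>());
        outer.put("dog", inner);
        outer.get("dog").get("a.txt").add(2);

        // Same object under both keys, so this is true.
        return outer.get("cat") == outer.get("dog");
    }

    public static void main(String[] args) {
        System.out.println(sharesInnerMap()); // prints true
    }
}
```

Note that position 1 for "cat" is also lost here, because the second put swapped in a fresh TreeSet for the one file key that every word shares.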

If you want to have one HashMap<String, TreeSet> per word, you'll need to create a new one every time you see a new word instead of just once. Move your new HashMap<String, TreeSet<Integer>>() to immediately before the mapString.put line and you'll almost have it working: you'll have one HashMap<String, TreeSet> per word, but you're still only creating a TreeSet the first time you see the word, so each word only ever gets a TreeSet for one file.

Fix that the same way (by making a new TreeSet if you haven't seen that file for that word before) and you should be good!
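Putting both fixes together might look roughly like the sketch below. It is not a drop-in replacement for your method: the class WordIndex and its methods addFile and asMap are hypothetical names, and it takes an already-parsed array of words so that it runs without your parse method or the stemmer. Map.computeIfAbsent (Java 8+) does the "create it only if it is missing" step for both levels.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

class WordIndex {
    // word -> file name -> 1-based positions of the word in that file
    private final Map<String, Map<String, TreeSet<Integer>>> index = new HashMap<>();

    /** Records every word of one file; positions are counted from 1. */
    void addFile(String fileName, String[] words) {
        int counter = 0;
        for (String word : words) {
            counter++;
            index
                // New inner map the first time this word is seen.
                .computeIfAbsent(word, w -> new HashMap<>())
                // New TreeSet the first time this file is seen for this word.
                .computeIfAbsent(fileName, f -> new TreeSet<>())
                .add(counter);
        }
    }

    Map<String, Map<String, TreeSet<Integer>>> asMap() {
        return index;
    }

    public static void main(String[] args) {
        WordIndex idx = new WordIndex();
        idx.addFile("a.txt", "the cat saw the dog".split(" "));
        idx.addFile("b.txt", "the dog".split(" "));
        System.out.println(idx.asMap().get("the").get("a.txt")); // prints [1, 4]
    }
}
```

Because each level is created on demand, no word shares its inner map with another word, and a word that appears in several files gets one TreeSet per file.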

Upvotes: 1
