Reputation: 711
This question is a continuation of this thread:
In short: to solve my problem, I want to use Map<Set<String>, String>.
However, after sorting my data entries in Excel and removing the unnecessary parameters, I ended up with the following:
flow content ==> content content
flow content ==> content depth distance
flow content ==> content depth within
flow content ==> content depth within distance
flow content ==> content within
flow content ==> content within distance
If that is the case, the same key (flow content) appears more than once, so it is not unique in the hashmap. How do I get around this? Does anyone have any idea?
I was thinking of maybe Map<Set<String>, List<String>>, so that I can do something like:
Set<flow content>, List<'content content', 'content depth distance', 'content depth within', ..., 'content within distance'>
But because I am parsing the entries line by line, I can't figure out how to collect the values of a repeated key (flow content) into the same list and add that list to the map.
Does anyone have a rough idea of how this can be done in Java?
Thanks in advance.
--EDIT:
I am trying Multimap, but I have run into a slight problem:
public static void main(String[] args) {
    File file = new File("apriori.txt");
    Multimap<Set<String>, String> mm = HashMultimap.create();
    Set<String> s = null;
    List l = null;
    BufferedReader br = null;
    try {
        br = new BufferedReader(new FileReader(file));
        String line = "";
        while ((line = br.readLine()) != null) {
            //Regex delete only tokenize
            String[] string = line.split(";");
            System.out.println(string[0] + " " + string[1]);
            StringTokenizer st = new StringTokenizer(string[0].trim());
            while (st.hasMoreTokens()) {
                //System.out.println(st.nextToken());
                s = new HashSet<String>();
                s.add(st.nextToken());
            }
            mm.put(s, string[1]);
        }
        // dispose all the resources after using them.
        br.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    Set<String> t = new HashSet<String>();
    t.add("content");
    t.add("by");
    String str = mm.get(t).toString();
    System.out.println(str);
    for (Map.Entry<Set<String>, String> e : mm.entries()) {
        System.out.println(e);
    }
}
The contents of apriori.txt:
byte_jump ; msg
byte_jump ; msg by
content ; msg
content by ; flow
content by ; msg
content by ; msg flow
content by byte_jump ; msg
content byte_jump ; by
content byte_jump ; msg
content byte_jump ; msg by
The output of the for loop is:
[content]= msg
[by]= flow
[by]= msg
[by]= msg flow
[byte_jump]= msg
[byte_jump]= by
[byte_jump]= msg by
instead of [content by]= msg flow
Why is that so? I tried and it works. But I need Set to compare the strings regardless of position. What can I do?
Upvotes: 0
Views: 1691
Reputation: 346377
Regarding your code with Multimap: the only thing you're doing wrong is creating a new set for every token instead of putting all the tokens of a line into the same set. That's also why you're missing tokens. This works:
s = new HashSet<String>();
while (st.hasMoreTokens()) {
    //System.out.println(st.nextToken());
    s.add(st.nextToken());
}
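To see why one set per line matters, here is a minimal JDK-only sketch. The class name `KeySetDemo` and the helper `parseKey` are made up for illustration and are not part of the original code. It builds the key the way the corrected loop does and checks that two lines with the same tokens in a different order produce equal keys.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.StringTokenizer;

// Hypothetical helper class, not part of the original code.
class KeySetDemo {

    // Collects every whitespace-separated token of the key part into ONE set.
    static Set<String> parseKey(String keyPart) {
        Set<String> key = new HashSet<String>(); // created once, before the loop
        StringTokenizer st = new StringTokenizer(keyPart.trim());
        while (st.hasMoreTokens()) {
            key.add(st.nextToken());
        }
        return key;
    }

    public static void main(String[] args) {
        Set<String> a = parseKey("content by ");
        Set<String> b = parseKey(" by content");
        // Set equality ignores insertion order, so both lines give the same key.
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
    }
}
```

Because equal sets also have equal hash codes, such a set can be used as the key of a Multimap or a HashMap, as long as it is not modified after it has been inserted.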
Upvotes: 2
Reputation: 5120
public static void main(String[] args) throws IOException {
    final File file = new File("apriori.txt");
    final Multimap<String, String> map = HashMultimap.create();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
        String line;
        while ((line = reader.readLine()) != null) {
            // Each line has the form "key tokens ; value"
            final String[] parts = line.split(" ; ");
            map.put(parts[0].trim(), parts[1].trim());
        }
    }
    // Print every key/value pair; a key with several values appears once per value
    for (Map.Entry<String, String> e : map.entries()) {
        System.out.println(e);
    }
}
That should do the trick. (I didn't compile it, so no guarantees.)
Make sure you use Multimap<String, String>; there is no need to use a single-element set as the key.
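One caveat with a String key: it compares by exact text, so "content by" and "by content" would be different keys. If token order should not matter, as the question asks, one possible workaround is to normalize the key before using it. The sketch below is an assumption on my part rather than anything from the answer above; `CanonicalKeyDemo` and `canonicalKey` are hypothetical names, and it relies on `String.join`, which needs Java 8 or later.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of the original answer.
class CanonicalKeyDemo {

    // Sorts the whitespace-separated tokens so that token order no longer matters.
    static String canonicalKey(String keyPart) {
        String[] tokens = keyPart.trim().split("\\s+");
        Arrays.sort(tokens);
        return String.join(" ", tokens);
    }

    public static void main(String[] args) {
        System.out.println(canonicalKey("content by")); // by content
        System.out.println(canonicalKey("by content")); // by content
    }
}
```

Unlike a Set key, this keeps repeated tokens ("content content" stays as it is), so whether it fits depends on how duplicates in a key should be treated.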
Upvotes: 1
Reputation: 160992
A multimap allows multiple values for a specific key.
One implementation is the family of Multimap classes provided as part of Google Collections.
Rather than hand-coding a way to correctly store data in a Map<String, List<String>>, it would probably be a better choice to go ahead and use the appropriate data structure for the job.
Upvotes: 1
Reputation: 21805
The logic is essentially:
As another poster has mentioned, you could consider a standard multi-map library class such as that provided in Google Collections. (I personally would just implement it myself because it's really simple and doesn't really warrant a whole additional library in my view, but mileage varies.)
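For the do-it-yourself route, a minimal sketch using only the JDK could look like the following. It assumes Java 8 or later (for `computeIfAbsent`) and the "key tokens ; value" line format from the question; `SimpleMultimapDemo` and `addLine` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical class: a hand-rolled multimap built on Map<Set<String>, List<String>>.
class SimpleMultimapDemo {

    // Adds one "key tokens ; value" line to the map.
    static void addLine(Map<Set<String>, List<String>> map, String line) {
        String[] parts = line.split(";", 2);
        if (parts.length < 2) {
            return; // assumption: lines without a ';' separator are skipped
        }
        // All tokens of the key part go into one set, so token order is irrelevant.
        Set<String> key = new HashSet<>(Arrays.asList(parts[0].trim().split("\\s+")));
        // Fetch the list for this key, creating it on first use, then append the value.
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(parts[1].trim());
    }

    public static void main(String[] args) {
        String[] lines = {
            "content by ; flow",
            "content by ; msg",
            "content by ; msg flow",
            "content ; msg"
        };
        Map<Set<String>, List<String>> map = new HashMap<>();
        for (String line : lines) {
            addLine(map, line);
        }
        Set<String> lookup = new HashSet<>(Arrays.asList("by", "content"));
        System.out.println(map.get(lookup)); // [flow, msg, msg flow]
    }
}
```

On older Java versions the `computeIfAbsent` call can be replaced by a `get`, a null check that creates and `put`s a new list, and then an `add`. Note that a List keeps duplicate values, whereas HashMultimap stores each key/value pair only once.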
Upvotes: 2