Reputation: 121
I'm developing a Java application that reads a lot of string data like this:
1 cat (first read)
2 dog
3 fish
4 dog
5 fish
6 dog
7 dog
8 cat
9 horse
...(last read)
I need a way to keep every [string, occurrences] pair, ordered from last read to first read:
string occurrences
horse 1 (first print)
cat 2
dog 4
fish 2 (last print)
Currently I use two lists:
1) List<String> input;
where I add all the data.
In my example:
input.add("cat");
input.add("dog");
input.add("fish");
...
2) List<String> possibilities;
where I insert each string once, in this way:
if(possibilities.contains("cat")){
possibilities.remove("cat");
}
possibilities.add("cat");
This way I get a list of all possibilities, ordered by when each one was last read. I use it like this:
int occurrence;
for(String possible:possibilities){
occurrence = Collections.frequency(input, possible);
System.out.println(possible + " " + occurrence);
}
That trick works well, but it's too slow (I've got millions of inputs)... any help?
(English isn’t my first language, so please excuse any mistakes.)
Upvotes: 2
Views: 3760
Reputation: 3160
What you could do:
Code:
/* I don't know what logic you use to create the input list,
* so I'm using your input example. */
List<String> input = Arrays.asList("cat", "dog", "fish", "dog",
"fish", "dog", "dog", "cat", "horse");
/* by the way, this changes the input list!
* Copy it in case you need to preserve the original input. */
Collections.reverse(input);
Set<String> possibilities = new LinkedHashSet<String>(input);
for (String s : possibilities) {
System.out.println(s + " " + Collections.frequency(input, s));
}
Output:
horse 1
cat 2
dog 4
fish 2
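Keep in mind that Collections.frequency rescans the whole list once per distinct string, so with millions of inputs this can still be slow. A minimal sketch of a single backward pass that also leaves the input list untouched, assuming Java 8 or later for Map.merge (the class and method names are made up for illustration):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.ListIterator;
import java.util.Map;

class BackwardCount {
    // Counts every string in one pass, walking from the last element to the
    // first, so the LinkedHashMap's insertion order is "last read first".
    static Map<String, Integer> countFromEnd(List<String> input) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        ListIterator<String> it = input.listIterator(input.size());
        while (it.hasPrevious()) {
            counts.merge(it.previous(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("cat", "dog", "fish", "dog",
                "fish", "dog", "dog", "cat", "horse");
        for (Map.Entry<String, Integer> e : countFromEnd(input).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

For the sample input this prints the same four lines as above. Walking backward with a ListIterator is cheap on an ArrayList or the list returned by Arrays.asList; on other List implementations the cost may differ.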
Upvotes: 0
Reputation: 34628
If you know that your data is not going to exceed your memory capacity when you read it all into memory, then the solution is simple: use a LinkedList and a LinkedHashMap.
For example, if you use a Linked list:
LinkedList<String> input = new LinkedList<>();
You then proceed to use input.add() as you did originally. But when the input list is full, you basically use Jordi Castilla's solution, except that you put the entries of the linked list into the map in reverse order. To do that, you do:
Iterator<String> iter = input.descendingIterator();
LinkedHashMap<String,Integer> map = new LinkedHashMap<>();
while (iter.hasNext()) {
String s = iter.next();
if ( map.containsKey(s)) {
map.put( s, map.get(s) + 1);
} else {
map.put(s, 1);
}
}
Now, the only real difference between his solution and mine is that I'm using descendingIterator(), which is a method in LinkedList that gives you the entries in backwards order, from "horse" to "cat".
The LinkedHashMap will keep the proper order: whatever was entered first will be printed first, and because we entered things in reverse order, whatever was read last will be printed first. So if you print your map, the result will be:
{horse=1, cat=2, dog=4, fish=2}
If you have a very long file, and you can't load the entire list of strings into memory, you had better keep just the map of frequencies. In this case, in order to keep the order of entry, we'll use an object such as this:
private static class Entry implements Comparable<Entry> {
private static long nextOrder = Long.MIN_VALUE;
private String str;
private int frequency = 1;
private long order = nextOrder++;
public Entry(String str) {
this.str = str;
}
public String getString() {
return str;
}
public int getFrequency() {
return frequency;
}
public void updateEntry() {
frequency++;
order = nextOrder++;
}
@Override
public int compareTo(Entry e) {
if ( order > e.order )
return -1;
if ( order < e.order )
return 1;
return 0;
}
@Override
public String toString() {
return String.format( "%s: %d", str, frequency );
}
}
The trick here is that every time you update the entry (add one to the frequency), it also updates the order. But the compareTo() method orders Entry objects from high order (updated/inserted later) to low order (updated/inserted earlier).
Now you can use a simple HashMap<String,Entry> to store the information as you read it (I'm assuming you are reading from some sort of scanner):
Map<String,Entry> m = new HashMap<>();
while ( scanner.hasNextLine() ) {
String str = scanner.nextLine();
Entry entry = m.get(str);
if ( entry == null ) {
entry = new Entry(str);
m.put(str, entry);
} else {
entry.updateEntry();
}
}
scanner.close();
Now you can sort the values of the entries:
List<Entry> orderedList = new ArrayList<Entry>(m.values());
m = null;
Collections.sort(orderedList);
Running System.out.println(orderedList) will give you:
[horse: 1, cat: 2, dog: 4, fish: 2]
In principle, you could use a TreeMap whose keys contained the "order" stuff, rather than a plain HashMap like this followed by sorting, but I prefer to have neither mutable keys in a map nor keys that change constantly. Here we are only changing the values as we fill the map, and each key is inserted into the map only once.
Upvotes: 0
Reputation: 761
Here is the complete solution for your problem,
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class DataDto implements Comparable<DataDto> {
    // Shared read counter; unlike clock readings it cannot produce ties.
    private static long nextOrder = 0;
    public int count = 1; // the first sighting is one occurrence
    public String string;
    public long lastSeenOrder;
    public DataDto(String string) {
        this.string = string;
        this.lastSeenOrder = nextOrder++;
    }
    @Override
    public boolean equals(Object object) {
        return object instanceof DataDto
                && string != null && string.equals(((DataDto) object).string);
    }
    @Override
    public int hashCode() {
        return string.hashCode();
    }
    // Most recently seen entries sort first.
    @Override
    public int compareTo(DataDto o) {
        return Long.compare(o.lastSeenOrder, this.lastSeenOrder);
    }
    @Override
    public String toString() {
        return this.string + " : " + this.count;
    }
    public static void main(String[] args) {
        String[] listOfAllStrings = {"horse", "cat", "dog", "fish", "cat", "fish", "dog", "cat", "horse", "fish"};
        Map<String, DataDto> results = new HashMap<String, DataDto>();
        for (String s : listOfAllStrings) {
            DataDto dataDto = results.get(s);
            if (dataDto != null) {
                dataDto.count = dataDto.count + 1;
                dataDto.lastSeenOrder = nextOrder++;
            } else {
                dataDto = new DataDto(s);
                results.put(s, dataDto);
            }
        }
        List<DataDto> finalResults = new ArrayList<DataDto>(results.values());
        System.out.println(finalResults); // HashMap iteration order, not guaranteed
        Collections.sort(finalResults);
        System.out.println(finalResults); // last seen first
    }
}
Output (the first line is in HashMap iteration order, which is not guaranteed):
[horse : 2, cat : 3, fish : 3, dog : 2]
[fish : 3, horse : 2, cat : 3, dog : 2]
I think this solution will be suitable for your requirement.
Upvotes: 0
Reputation: 19204
Make use of a TreeMap, which keeps its keys ordered as specified by the compare method of a MyStringComparator class. The comparator handles a MyString class that wraps String and adds an insertion index, like this:
// this better be immutable
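class MyString {
    private final String string;
    private final Long index;
    private MyString(String string, Long index) {
        this.string = string;
        this.index = index;
    }
    public static MyString valueOf(String s, Long l) { return new MyString(s, l); }
    public Long getIndex() { return index; }
    @Override
    public int hashCode() { return string.hashCode(); }
    @Override
    public boolean equals(Object o) { // relies on String.equals()
        return o instanceof MyString && string.equals(((MyString) o).string);
    }
}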
class MyStringComparator implements Comparator<MyString> {
public int compare(MyString s1, MyString s2) {
return s2.getIndex().compareTo(s1.getIndex());
}
}
Pass the comparator while constructing the map:
Map<MyString,Integer> map = new TreeMap<>(new MyStringComparator());
Then, while parsing your input, do
long counter = 0;
while (...) {
MyString item = MyString.valueOf(readString, counter++);
if (map.containsKey(item)) {
map.put(item, map.get(item) + 1);
} else {
map.put(item,1);
}
}
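There will be a lot of instantiation because of the immutable class, and the comparator will not be consistent with equals. Be aware that a TreeMap looks keys up with the comparator rather than with equals(), so because every item gets a fresh index, containsKey() will not find an earlier occurrence of the same string; the lookup by string has to happen separately.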
Disclaimer: this is untested code just to show what I'd do, I'll come back and recheck it when I get my hands on a compiler.
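One way the index idea could be made to work, as a sketch only: key the TreeMap by the read index alone and keep a separate HashMap from each string to its latest index, removing the stale position whenever a string is read again. All names below (IndexedCounter, add, report) are illustrative and not part of the code above; Java 8 or later is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class IndexedCounter {
    private final Map<String, Integer> counts = new HashMap<>();
    private final Map<String, Long> lastIndex = new HashMap<>();
    // Sorted by read index, highest (most recently read) first.
    private final TreeMap<Long, String> byIndex = new TreeMap<>(Comparator.reverseOrder());
    private long counter = 0;

    void add(String s) {
        counts.merge(s, 1, Integer::sum);
        Long previous = lastIndex.put(s, counter);
        if (previous != null) {
            byIndex.remove(previous); // drop the stale position of this string
        }
        byIndex.put(counter++, s);
    }

    // Returns "string count" lines, most recently read string first.
    List<String> report() {
        List<String> lines = new ArrayList<>();
        for (String s : byIndex.values()) {
            lines.add(s + " " + counts.get(s));
        }
        return lines;
    }

    public static void main(String[] args) {
        IndexedCounter c = new IndexedCounter();
        for (String s : Arrays.asList("cat", "dog", "fish", "dog",
                "fish", "dog", "dog", "cat", "horse")) {
            c.add(s);
        }
        c.report().forEach(System.out::println);
    }
}
```

Each read costs a couple of hash operations plus O(log k) tree work for k distinct strings, and the keys themselves are never mutated.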
Upvotes: 0
Reputation: 26971
Use a Map<String, Integer>, as @radoslaw pointed out. To keep the insertion order, use a LinkedHashMap and not a TreeMap, as described here:
LinkedHashMap keeps the keys in the order they were inserted, while a TreeMap is kept sorted via a Comparator or the natural Comparable ordering of the elements.
Imagine you have all the strings in some array, call it listOfAllStrings. Iterate over this array and use each string as the key in your map: if it does not exist yet, put it in the map; if it exists, add 1 to the current count...
Map<String, Integer> results = new LinkedHashMap<String, Integer>();
for (String s : listOfAllStrings) {
if (results.get(s) != null) {
results.put(s, results.get(s) + 1);
} else {
results.put(s, 1);
}
}
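Note that the map above keeps the order of first insertion. If the order should follow the last read instead, as in the question, one option is an access-ordered LinkedHashMap, where touching a key moves it to the end. A minimal sketch, assuming Java 8 or later for merge(); the class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AccessOrderCount {
    static List<Map.Entry<String, Integer>> countLastReadFirst(Iterable<String> input) {
        // accessOrder = true: each merge() moves the touched key to the end.
        Map<String, Integer> results = new LinkedHashMap<>(16, 0.75f, true);
        for (String s : input) {
            results.merge(s, 1, Integer::sum);
        }
        // Copy the entries and reverse them so the most recent key comes first.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(results.entrySet());
        Collections.reverse(entries);
        return entries;
    }

    public static void main(String[] args) {
        List<String> listOfAllStrings = Arrays.asList("cat", "dog", "fish", "dog",
                "fish", "dog", "dog", "cat", "horse");
        for (Map.Entry<String, Integer> e : countLastReadFirst(listOfAllStrings)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

With the question's sample data this prints horse 1, cat 2, dog 4, fish 2, one pair per line. Only the counts are held in memory, so it also suits input that is streamed rather than stored in a list.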
Upvotes: 1