Reputation: 5236
I have a String list as below. I want to do some calculations based on if this list has multiple elements with same value.
I got nearly 120k elements and when I run this code it runs too slow. Is there any faster approach than contains method?
List<String> words= getWordsFromDB(); //words list has nearly 120k elements
List<String> tempWordsList = new LinkedList<String>(); //empty list
String[] keys = getKeysFromDB();
List<String> tempKeysList = new LinkedList<String>();
for (int x = 0; x < words.size(); x++) {
if (!tempWordsList.contains(words.get(x))) {
tempWordsList.add(words.get(x));
String key= keys[x];
tempKeysList.add(key);
} else {
int index = tempWordsList.indexOf(words.get(x));
String m = tempKeysList.get(index);
String n = keys[x];
if (!m.contains(n)) {
String newWord = m + ", " + n;
tempKeysList.set(index, newWord);
}
}
}
EDIT: words list comes from database and problem is there is a service continuously updating and inserting data to this table. I don't have any access to this service and there are other applications who is using the same table.
EDIT2: I have updated for full code.
Upvotes: 1
Views: 89
Reputation: 50716
LinkedList.get()
runs in O(N) time. Either use ArrayList
with O(1) lookup time, or avoid indexed lookups altogether by using an iterator:
for (String word : words) {
if (!tempList.contains(word)) {
tempList.add(word);
} else {
int firstIndex = tempList.indexOf(word);
//do some calculations
}
}
Disclaimer: The above was written under the questionable assumption that words
is a LinkedList
. I would still recommend the enhanced-for loop, since it's more conventional and its time complexity is not implementation-dependent. Either way, the suggestion below still stands.
You can further improve by replacing tempList
with a HashMap
. This will avoid the O(N) cost of contains()
and indexOf()
:
Map<String, Integer> indexes = new HashMap<>();
int index = 0;
for (String word : words) {
Integer firstIndex = indexes.putIfAbsent(word, index++);
if (firstIndex != null) {
//do some calculations
}
}
Based on your latest update, it looks like you're trying to group "keys" by their corresponding "word". If so, you might give streams a spin:
List<String> words = getWordsFromDB();
String[] keys = getKeysFromDB();
Collection<String> groupedKeys = IntStream.range(0, words.size())
.boxed()
.collect(Collectors.groupingBy(
words::get,
LinkedHashMap::new, // if word order is significant
Collectors.mapping(
i -> keys[i],
Collectors.joining(", "))))
.values();
However, as mentioned in the comments, it would probably be best to move this logic into your database query.
Upvotes: 2
Reputation: 2098
//1) if you can avoid using linked list use below solution
List<String> words= getWordsFromDB(); //words list has nearly 120k elements
//if you can avoid using linked list, use set instead
Set<String> set=new HashSet<>();
for (String s:words) {
if (!set.add(s)) {
//do some calculations
}
}
//2) if you can't avoid using linked list use below code
List<String> words= getWordsFromDB(); //words list has nearly 120k elements
List<String> tempList = new LinkedList<String>(); //empty list
//if you can't avoid LinkedListv (tempList) you need to use a set
Set<String> set=new HashSet<>();
for (String s:words) {
if (set.add(s)) {
tempList.add(s);
} else {
int a = tempList.indexOf(s);
//do some calculations
}
}
Upvotes: 2
Reputation: 131346
Acutally, tempList
use linear complexity time methods :
if (!tempList.contains(words.get(x))) {
and
int a = tempList.indexOf(words.get(x));
It means that at each invocation of them, the list is in average iterate at half.
Besides, these are redundant.
indexOf()
only could be invoked :
for (int x = 0; x < words.size(); x++) {
int indexWord = tempList.indexOf(words.get(x));
if (indexWord != -1) {
tempList.add(words.get(x));
} else {
//do some calculations by using indexWord
}
}
But to improve all accesses, you should change your structure : wrapping or replacing LinkedList
by LinkedHashSet
.
LinkedHashSet
would keep the actual behavior because as List
, it defines the iteration ordering, which is the order in which elements were inserted into the set but it also uses hashing feature to improve time access to its elements.
Upvotes: 0
Reputation: 310884
You are searching the list twice per word: once for contains()
and once for indexOf()
. You could replace contains()
by indexOf()
, test the result for -1, otherwise reuse the result instead of calling indexOf()
again. But you are certainly using the wrong data structure. What exactly do you need a
for? Do you need a
? I would use a HashSet
, or a HashMap
if you need to associate other data with each word.
Upvotes: 2