Reputation: 388
I have a big list<String>
that is about 50,000 records. I want an effective way to search for a specific subString
in that List
and get the Strings
that contains that subString
.
My code is like this so far:
List<String> result = new ArrayList<>();
if (aCondition) {
for (String file : arg) {
if (file.toLowerCase().contains(tag.toLowerCase())) {
result.add(file);
}
}
}
return result;
Upvotes: 0
Views: 1095
Reputation: 14
If your processor has more than one core just go and use parallel streams.
List<String> result = lines.parallelStream() //convert list to parallel stream
.filter(line -> file.toLowerCase().contains(tag.toLowerCase())) // check your condition
.collect(Collectors.toList()); // collect output
The above code will process your strings faster if your processor has more than one core because a parallel stream is opened.
Upvotes: 0
Reputation: 494
Consider to use a SQL database to hold big amounts of data.
In this way you can use a simple query to get a result String containing a substring (look at example below). Furthermore your memory will be free of that amount of data loaded in list.
e.g.
SELECT * from word_list_table WHERE word LIKE'%substring%'
Upvotes: 0
Reputation: 140623
It depends on what you mean by effective.
If you want to get to "minimal" CPU usage, then there isn't much you can do: you have to iterate that list; and compare all entries. The only obvious thing to not do: call tag.toLowerCase()
for each loop body. Just compute that value once before entering the loop!
If you care about getting result in less time, the answer is simple: use multiple threads, and have each thread search a "slice" of the overall list (of course, that can turn complicated quickly, as you now have to preserve order and other subtle things).
Finally: you might want to look into tools such ElasticSearch - as there are various products designed to exactly do that: search huge amounts of text.
Upvotes: 1