Reputation: 2274
I am working on a dictionary application written with React Native.
To filter the array from the search box, I wrote the function below. It works quite well when I test with a 2,000-word list, but when the word list grows by thousands more, the search becomes really slow.
So, how can I improve this search function?
// Filter the array as the user types (search)
let filteredWords = [];
if (this.state.searchField != null) {
  filteredWords = this.state.glossaries.filter(glossary => {
    return glossary.word.toLowerCase().includes(this.state.searchField.toLowerCase());
  });
}
Upvotes: 3
Views: 6129
Reputation: 3102
There are multiple factors that are making this code slow:
- Using filter() with a lambda. This adds function call overhead for each item being searched.
- Calling toLowerCase() on both strings before calling includes(). This allocates two new string objects for every comparison.
- Calling includes(). For some reason, the includes() method is not as well optimized in some browsers as indexOf().
for loop (-11%)
Instead of using the filter() method, I recommend creating a new Array and using a for loop to fill it.
const glossaries = this.state.glossaries;
const searchField = this.state.searchField;
const filteredWords = [];
for (let i = 0; i < glossaries.length; i++) {
  // Each glossary is an object, so compare against its word property
  if (glossaries[i].word.toLowerCase().includes(searchField.toLowerCase())) {
    filteredWords.push(glossaries[i]);
  }
}
Memory allocation is expensive because JavaScript relies on garbage collection to free memory that is no longer used. When a garbage collection runs, the program can be paused while the collector looks for memory that is not referenced anymore.
You can get rid of the toLowerCase() call inside the search loop completely by making a lowercase copy of the glossary words every time the glossary is updated, which I assume is not often.
// When you build the glossary
this.state.glossaries = ...;
// Lowercase copy of each word, kept at the same index as its glossary entry
this.state.searchGlossaries = this.state.glossaries.map(g => g.word.toLowerCase());
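In a React component, state is normally updated through setState() rather than by assigning to this.state directly, so it can help to build both arrays in one place and store them together. A minimal sketch, assuming each glossary entry is an object with a word property as in the question; buildGlossaryState and loadedGlossaries are hypothetical names, not part of the original answer:

```javascript
// Hypothetical helper: builds both arrays together so their indexes always line up.
// Assumes every entry looks like { word: '...' }, as in the question.
function buildGlossaryState(glossaries) {
  return {
    glossaries: glossaries,
    // Lowercase each word once here instead of on every search.
    searchGlossaries: glossaries.map(g => g.word.toLowerCase()),
  };
}

// Inside the component, whenever the word list is loaded or changed:
// this.setState(buildGlossaryState(loadedGlossaries));
```

Because both arrays come from the same call, searchGlossaries[i] always corresponds to glossaries[i].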
You can also avoid repeating toLowerCase() on the search text by calling it once before the loop. After these changes, the code will look like:
const glossaries = this.state.glossaries;
const searchGlossaries = this.state.searchGlossaries;
const searchField = this.state.searchField.toLowerCase();
const filteredWords = [];
for (let i = 0; i < glossaries.length; i++) {
  if (searchGlossaries[i].includes(searchField)) {
    filteredWords.push(glossaries[i]);
  }
}
indexOf() instead of includes() (-13%)
I am not really sure why this is the case, but tests show that indexOf() is a lot faster than includes().
const glossaries = this.state.glossaries;
const searchGlossaries = this.state.searchGlossaries;
const searchField = this.state.searchField.toLowerCase();
const filteredWords = [];
for (let i = 0; i < glossaries.length; i++) {
  if (searchGlossaries[i].indexOf(searchField) !== -1) {
    filteredWords.push(glossaries[i]);
  }
}
Overall the performance has improved by 70%. I got the performance percentages from https://jsperf.com/so-question-perf
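To check these numbers against your own word list, a rough timing harness may be enough. The sketch below is not the original benchmark: it assumes entries of the form { word: '...' }, uses made-up sample data, and the names filterSlow, filterFast and timeIt are hypothetical. Timings vary a lot between engines and devices, so treat the output as a relative comparison only.

```javascript
// Original approach: filter() with two toLowerCase() calls per item.
function filterSlow(glossaries, searchField) {
  return glossaries.filter(g =>
    g.word.toLowerCase().includes(searchField.toLowerCase()));
}

// Optimized approach: pre-lowercased words, plain for loop, indexOf().
function filterFast(glossaries, searchGlossaries, searchField) {
  const needle = searchField.toLowerCase();
  const result = [];
  for (let i = 0; i < glossaries.length; i++) {
    if (searchGlossaries[i].indexOf(needle) !== -1) {
      result.push(glossaries[i]);
    }
  }
  return result;
}

// Runs fn `runs` times and returns the elapsed wall-clock time in milliseconds.
function timeIt(fn, runs) {
  const start = Date.now();
  for (let i = 0; i < runs; i++) {
    fn();
  }
  return Date.now() - start;
}

// Made-up sample data: 50,000 entries such as "Word0", "Word1", ...
const sampleGlossaries = [];
for (let i = 0; i < 50000; i++) {
  sampleGlossaries.push({ word: 'Word' + i });
}
const sampleSearchGlossaries = sampleGlossaries.map(g => g.word.toLowerCase());

const slowMs = timeIt(() => filterSlow(sampleGlossaries, 'RD49'), 20);
const fastMs = timeIt(
  () => filterFast(sampleGlossaries, sampleSearchGlossaries, 'RD49'), 20);
console.log('filter + includes:', slowMs, 'ms');
console.log('for + indexOf:', fastMs, 'ms');
```

Both functions should return the same entries; only the time taken differs.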
In the comments you said you would like an example of optimizations that can be done when the requirements are loosened to only match words that start with the search text. One way to do this is a binary search.
Let's take the code from above as a starting point. We sort the glossaries before we store them in the state. For case-insensitive sorting, JavaScript exposes the Intl.Collator constructor. It provides the compare(x, y) method, which returns:
Return value | Meaning
negative value | x is less than y
zero | x is equal to y
positive value | x is greater than y
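To make those return values concrete, here is a small sketch of compare() with the 'base' sensitivity option. The variable name demoCollator and the sample words are made up for illustration:

```javascript
// With sensitivity 'base', letters that differ only by case or accent compare as equal.
const demoCollator = new Intl.Collator(undefined, { sensitivity: 'base' });

// Negative: 'apple' sorts before 'Banana', regardless of case.
console.log(demoCollator.compare('apple', 'Banana') < 0);
// Zero: the two words differ only by case.
console.log(demoCollator.compare('apple', 'APPLE') === 0);
// Positive: 'cherry' sorts after 'Banana'.
console.log(demoCollator.compare('cherry', 'Banana') > 0);
```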
And the resulting code:
// Static in the file
const collator = new Intl.Collator(undefined, {
  sensitivity: 'base'
});
// Compares only the leading characters of `word` with `searchText`, so a
// result of 0 means "word starts with searchText" (ignoring case).
// Assumes plain alphabetic words, where collation order and prefix order agree.
function comparePrefix(word, searchText) {
  return collator.compare(word.slice(0, searchText.length), searchText);
}
function binarySearch(glossaries, searchText) {
  let lo = 0;
  let hi = glossaries.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) / 2 | 0;
    const comparison = comparePrefix(glossaries[mid].word, searchText);
    if (comparison < 0) {
      lo = mid + 1;
    }
    else if (comparison > 0) {
      hi = mid - 1;
    }
    else {
      return mid;
    }
  }
  return -1;
}
// When you build the glossary
this.state.glossaries = ...;
this.state.glossaries.sort(function(x, y) {
  return collator.compare(x.word, y.word);
});
// When you search
const glossaries = this.state.glossaries;
const searchField = this.state.searchField.toLowerCase();
const filteredWords = [];
// idx is moved below, so it must be declared with let rather than const
let idx = binarySearch(glossaries, searchField);
if (idx !== -1) {
  // Find the index of the first matching word, seeing as the binary search
  // will end up somewhere in the middle
  while (idx > 0 && comparePrefix(glossaries[idx - 1].word, searchField) === 0) {
    idx--;
  }
  // Add each matching word to the filteredWords
  while (idx < glossaries.length && comparePrefix(glossaries[idx].word, searchField) === 0) {
    filteredWords.push(glossaries[idx]);
    idx++;
  }
}
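A variation on the same idea, sketched here as a standalone function: a lower-bound binary search jumps straight to the first candidate, so no backwards walk is needed. Note that this swaps Intl.Collator for plain lowercase string comparison, so accents are not folded. It assumes entries of the form { word: '...' }, and the names buildSortedIndex, lowerBound and findByPrefix are hypothetical:

```javascript
// Builds a list sorted by lowercase word, using plain code-unit string comparison.
// Assumes entries look like { word: '...' }.
function buildSortedIndex(glossaries) {
  return glossaries
    .map(g => ({ key: g.word.toLowerCase(), entry: g }))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

// Returns the first position whose key is >= target (lower-bound binary search).
function lowerBound(index, target) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (index[mid].key < target) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// Collects every entry whose word starts with searchField (case-insensitive).
function findByPrefix(index, searchField) {
  const prefix = searchField.toLowerCase();
  const result = [];
  for (let i = lowerBound(index, prefix); i < index.length; i++) {
    if (!index[i].key.startsWith(prefix)) {
      break; // sorted order: once the prefix stops matching, no later key matches
    }
    result.push(index[i].entry);
  }
  return result;
}

// Made-up sample data for illustration.
const prefixIndex = buildSortedIndex([
  { word: 'Banana' }, { word: 'apple' }, { word: 'Apricot' }, { word: 'cherry' },
]);
// Logs the words of the entries for 'apple' and 'Apricot'.
console.log(findByPrefix(prefixIndex, 'AP').map(g => g.word));
```

The index only needs rebuilding when the glossary changes; each search then costs one binary search plus one step per match.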
Upvotes: 9
Reputation: 31712
As the question doesn't seem to belong on CodeReview, I think there are a few things that you can do to make your code drastically faster [citation needed]:
- Call this.state.searchField.toLowerCase() once, outside the loop, as you don't need to call it on every iteration.
- Use plain for loops instead of flashy-but-slow Array functions.
And here is the final result:
let filteredWords = [];
if (this.state.searchField != null) {
  let searchField = this.state.searchField.toLowerCase(),
      theArray = this.state.glossaries; // cache this too
  for (let i = 0, l = theArray.length; i < l; ++i) {
    if (theArray[i].word.toLowerCase().includes(searchField)) {
      filteredWords.push(theArray[i]);
    }
  }
}
Edit:
If you want to search for glossaries whose word starts with searchField, use indexOf() === 0 instead of includes() as the condition, like this:
if(theArray[i].word.toLowerCase().indexOf(searchField) === 0) {
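If String.prototype.startsWith() is available in your JavaScript engine (it is part of ES2015), it states the same condition more directly. It only needs to inspect the beginning of the word, whereas indexOf() keeps scanning the rest of the string when the prefix is not at position 0. A minimal sketch, assuming entries of the form { word: '...' }; filterByPrefix is a hypothetical name:

```javascript
// Returns the entries whose word starts with searchField, ignoring case.
function filterByPrefix(glossaries, searchField) {
  const needle = searchField.toLowerCase(); // lowercase the search text once
  const result = [];
  for (let i = 0, l = glossaries.length; i < l; ++i) {
    if (glossaries[i].word.toLowerCase().startsWith(needle)) {
      result.push(glossaries[i]);
    }
  }
  return result;
}
```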
Upvotes: 2