thomas
thomas

Reputation: 177

Filter large arrays using node js

I have 2 large arrays populated with strings both containing > 150.000 elements

const allNew = allArrayOneValues.filter(val => !allArrayTwoValues.includes(val));

I need to compare the two arrays like this to find out which elements are not in ArrayTwo yet or to find out which elements to delete from ArrayTwo as they are no longer in ArrayOne.

Filtering here takes around 3 to 5 minutes... is there a way to do a far more efficient compared to find out which values in ArrayOne are not yet in ArrayTwo OR which values are in ArrayTwo which are not in ArrayOne...

Thanks Thomas

Upvotes: 1

Views: 990

Answers (2)

Sachin
Sachin

Reputation: 3540

Your current algorithm is O(m*n) (where m= length of first array, n= length of second array)

Using an efficient data-structure that can do sub-linear lookup, it's possible to this in atleast O(m*lg(n))

So for 150,000 elements, it would be 10 thousand times faster and should take a few milliseconds instead of minutes.

let allArrayOneValues = [1, 2, 4]
let allArrayTwoValues = [3, 9, 2]
let hash = new Set();
allArrayTwoValues.forEach((value) => {
  hash.add(value)
})

const allNew = allArrayOneValues.filter(val => !hash.has(val));
console.log(allNew)

Upvotes: 2

Debug Anywhere
Debug Anywhere

Reputation: 11

use Set or Object may be a good choice. Here is an example:

// convert allArrayTwoValues to Object.
const tmp = allArrayTwoValues.reduce((prev, item) => { prev[item] = true; return prev; }, {});
// filter values not in allArrayTwoValues
const allNew = allArrayOneValues.filter(item => !tmp[item]);

Upvotes: 1

Related Questions