MrFlo

Reputation: 339

How to extract a sub-array from two arrays with high performance?

I have two arrays of JSON objects, e.g.:

let refarray = [{key : 1, attr1 : 'aze', ...}, {key : 1, attr1 : 'zer', ...},{key : 2, attr1 : 'ert'},...]
let otherarray = [{key : 1, attr2 : 'wxc', ...}, {key : 3, attr2 : 'xcv'},...]

I simply need to extract from refarray all elements whose key exists in otherarray.

For the moment I'm using lodash as follows:

let newarray = _.filter(refarray , function(d) { return _.findIndex(otherarray , function(s) { return s.key=== d.key;}) >= 0});

But it takes between 3 and 15 seconds, which is far too long. Any faster solution is welcome. Thanks.

Upvotes: 0

Views: 105

Answers (3)

Shidai

Reputation: 227

Depending on the number of duplicate keys, the solution by Emil S. Jørgensen might not be fast enough. I would first collect the distinct keys of otherarray, then filter refarray against that list:

var d2 = Date.now();
// Collect each key of otherarray once, so the lookup list stays short
var distinct = [];
otherarray.forEach(function(item) {
    if (distinct.indexOf(item.key) < 0) {
        distinct.push(item.key);
    }
});
// Keep the refarray elements whose key appears in otherarray
console.log('Results:', refarray.filter(function(item) {
    return distinct.indexOf(item.key) > -1;
}));
console.log('Milliseconds to filter:', Date.now() - d2);
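
The distinct-key idea can also be sketched with a Set, assuming the keys are primitives (numbers or strings) so that Set membership matches `===` comparison. The helper name `filterByKeys` and the sample data are hypothetical, chosen only to mirror the arrays in the question:

```javascript
// Hypothetical helper: keep the elements of `source` whose key occurs in `lookup`.
function filterByKeys(source, lookup) {
  // Build the key set once; duplicate keys collapse automatically.
  const keys = new Set(lookup.map(function (item) { return item.key; }));
  // Set.prototype.has avoids rescanning an array for every element.
  return source.filter(function (item) { return keys.has(item.key); });
}

// Sample data shaped like the arrays in the question (values are made up).
const sampleRef = [
  { key: 1, attr1: 'aze' },
  { key: 1, attr1: 'zer' },
  { key: 2, attr1: 'ert' }
];
const sampleOther = [
  { key: 1, attr2: 'wxc' },
  { key: 3, attr2: 'xcv' }
];

const matched = filterByKeys(sampleRef, sampleOther);
console.log(matched);
```

Both elements with key 1 are kept, because duplicates in refarray are preserved and only the lookup side is deduplicated.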

Upvotes: 1

Harsh Gupta

Reputation: 4538

You may try caching the keys of otherarray and then filtering refarray. I tried a small sample (in Node rather than a browser) and it took a little over 100 ms:

// Assumes lodash is installed (npm install lodash)
const _ = require('lodash')

let refarray = []
let otherarray = []

for(let i of Array(60 * 1000).keys())
  refarray.push({ key: 1 + (i % 1200) })

for(let i of Array(1000).keys())
  otherarray.push({ key: i + 1 })

console.time('cache')
let cache = _.uniq(_.map(otherarray, n => n.key))
const inCache = n => cache.indexOf(n.key) !== -1

let newArray = _.filter(refarray, inCache)

console.timeEnd('cache')
console.log(refarray.length, otherarray.length, newArray.length);
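
To check whether the cache's data structure matters on a given machine, a rough comparison can be run on the same synthetic data without lodash. This is only a sketch: `timeIt`, `bigRef` and `bigOther` are hypothetical names, and the timings it prints will vary between runs and engines.

```javascript
// Same synthetic shape as the sample: 60,000 reference items, 1,000 lookup items.
const bigRef = [];
const bigOther = [];
for (let i = 0; i < 60 * 1000; i++) bigRef.push({ key: 1 + (i % 1200) });
for (let i = 0; i < 1000; i++) bigOther.push({ key: i + 1 });

// Hypothetical helper: run fn once, log the elapsed wall-clock time, return its result.
function timeIt(label, fn) {
  const start = Date.now();
  const result = fn();
  console.log(label, Date.now() - start, 'ms');
  return result;
}

// Cache held in a plain array, looked up with indexOf (a linear scan per element).
const viaIndexOf = timeIt('array cache:', function () {
  const cache = bigOther.map(function (n) { return n.key; });
  return bigRef.filter(function (n) { return cache.indexOf(n.key) !== -1; });
});

// Cache held in a Set, looked up with has.
const viaSet = timeIt('set cache:', function () {
  const cache = new Set(bigOther.map(function (n) { return n.key; }));
  return bigRef.filter(function (n) { return cache.has(n.key); });
});

// Both variants should select the same number of elements.
console.log(viaIndexOf.length, viaSet.length);
```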

Upvotes: 1

Emil S. Jørgensen

Reputation: 6366

Array.prototype.filter combined with Array.prototype.some should be a fast approach that needs no library:

//Default ref
var refarray = [{
  key: 1,
  attr1: 'aze'
}, {
  key: 2,
  attr1: 'zer'
}];
//Default other
var otherarray = [{
  key: 1,
  attr2: 'wxc'
}, {
  key: 3,
  attr2: 'xcv'
}];
//Padding ref
while (refarray.length < 10 * 1000) {
  refarray.push({
    key: 5,
    attr1: 'aze'
  })
}
//Padding other
while (otherarray.length < 60 * 1000) {
  otherarray.push({
    key: 6,
    attr2: 'aze'
  })
}
console.log('Size of refarray:', refarray.length);
console.log('Size of otherarray:', otherarray.length);
var d = Date.now();
console.log('Results:',refarray.filter(function(a) {
  return otherarray.some(function(b) {
    return b.key === a.key;
  })
}));
console.log('Milliseconds to filter:', Date.now() - d);
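
How much work `some` does depends on where the first match sits: it stops at the first callback that returns a truthy value, but a key with no match still costs a full pass over the array. The sketch below makes that visible by counting callback invocations on a small, hypothetical `haystack` array:

```javascript
// Count how many times the inner callback runs, to make the early exit visible.
let comparisons = 0;
const haystack = [{ key: 1 }, { key: 2 }, { key: 3 }, { key: 4 }];

// key 2 sits at index 1, so `some` stops after two calls.
const found = haystack.some(function (b) {
  comparisons++;
  return b.key === 2;
});
const callsOnHit = comparisons;

// key 9 is absent, so every element is visited.
comparisons = 0;
const missing = haystack.some(function (b) {
  comparisons++;
  return b.key === 9;
});
const callsOnMiss = comparisons;

console.log(found, callsOnHit, missing, callsOnMiss); // true 2 false 4
```

In the benchmark above, the padded refarray elements use key 5, which never occurs in otherarray, so most lookups there fall into the full-pass case.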

Upvotes: 0
