glitchtracker
glitchtracker

Reputation: 305

Javascript : more clever way to filter thousands of row against thousands of values

I've rows with 2 columns: product SKU and category ids I need to only return the product SKU and their cat_ids that match a list of cat_ids in SQL that would be :

SELECT SKU,cat_id FROM myTable where cat_id IN(my_huge cat_ids_list)

but I need to do this in Javascript from tab tab return data.

obviously, I could do it with two nested loops, but that would result in millions of comparisons and that would be too slow

if I have 30000 SKUs, to match against 1000 cat_ids that would at worst be 30M tests.

so is there a cleverer algorithmic approach that would filter this more globally, without needing so much tests ?

Many thanks

Upvotes: 0

Views: 1410

Answers (2)

Mark
Mark

Reputation: 92460

You don't need nested loops. Use a Set, which is constant time lookups. You need to create the set from the array of categories and then use set.has() to filter:

let rows = [
    {sku: 1, cat_id:100},
    {sku: 11, cat_id:10},
    {sku: 12, cat_id:10},
    {sku: 13, cat_id:20},
    {sku: 14, cat_id:10},
    {sku: 15, cat_id:20},
    {sku: 16, cat_id:100}
]
//original array of ids to match
let matchCats = [10, 20]

// make a set from it
let matchSet = new Set(matchCats)

// filter with has()
let filtered = rows.filter(item => matchSet.has(item.cat_id))
console.log(filtered)

This should get you linear time filtering.

Edit based on comment
If you can't use ES6, you can still take advantage of the fact that regular javascript objects can be used like hashes with constant key lookup. It's not as pretty, but this should still work in linear time:

let rows = [
    {sku: 1, cat_id:100},
    {sku: 11, cat_id:10},
    {sku: 12, cat_id:10},
    {sku: 13, cat_id:20},
    {sku: 14, cat_id:10},
    {sku: 15, cat_id:20},
    {sku: 16, cat_id:100}
]

let matchCats = [10, 20]

// make an object of key/null-value pairs
let matchObj = matchCats.reduce((obj, item) => (obj[item] = null, obj), {})
let filtered = rows.filter(item => item.cat_id in matchObj)
console.log(filtered)

Upvotes: 1

Henning Koehler
Henning Koehler

Reputation: 2657

Use a hashset for your list of cat_ids, which gives you constant-time containment checks. Not too familiar with javascript, but I believe the Set type should do the trick.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Set

Upvotes: 1

Related Questions