Reputation: 18513
I'm trying to come up with an algorithm to solve the following problem.
Given an array of ids:
var ids = [8098272281362432, 7824519999782912];
Find all the matches in an array of items:
var people = [
  {
    "id": 8098272281362432,
    "age": 59,
    "name": "Douglas Hunter"
  },
  {
    "id": 625873891885056,
    "age": 1,
    "name": "Lottie Owen"
  },
  {
    "id": 7824519999782912,
    "age": 100,
    "name": "Maud Wise"
  },
  {
    "id": 2561552265773056,
    "age": 115,
    "name": "Annie Bennett"
  }
];
Approach 1
I could solve this by sorting both arrays by id (O(n log n)) and then iterating through both arrays once from the top down (O(n)):
var alg1 = function(people, ids) {
  var matches = [];
  // Note: sort() sorts in place, so both input arrays are mutated.
  var sortedPeople = people.sort(function(a, b) {
    return a.id - b.id;
  });
  var sortedIds = ids.sort(function(a, b) {
    return a - b;
  });
  var curPersonIndex = 0;
  sortedIds.forEach(function(id) {
    // Advance until we reach the person with this id
    // (assumes every id has a matching person).
    while (sortedPeople[curPersonIndex].id !== id) {
      curPersonIndex++;
    }
    matches.push(sortedPeople[curPersonIndex]);
  });
  return matches;
};
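For example, with the sample arrays above, the matches come back in sorted-id order rather than in the order of the ids array:

var matches1 = alg1(people, ids);
console.log(matches1.map(function(p) { return p.name; }));
// ["Maud Wise", "Douglas Hunter"]

Also note that the sort() calls reorder both people and ids in place as a side effect.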
Approach 2
I thought I could improve on this with an O(n) algorithm by creating a mapping of ids to people, so that I can just look up the person for each id:
var alg2 = function(people, ids) {
  var matches = [];
  var peopleMap = {}; // plain object used as a hash map
  people.forEach(function(person) {
    // Is this O(1) or O(log n)?
    peopleMap[person.id] = person;
  });
  ids.forEach(function(id) {
    matches.push(peopleMap[id]);
  });
  return matches;
};
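Unlike alg1, this returns the matches in the same order as the ids array and leaves both inputs untouched:

var matches2 = alg2(people, ids);
console.log(matches2.map(function(p) { return p.name; }));
// ["Douglas Hunter", "Maud Wise"]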
However, when I test this out, both algorithms seem to perform about the same: #1 is faster in Chrome and #2 is slightly faster in Firefox.
http://plnkr.co/edit/FidAdBqS98RKebxaIlva?p=preview
I'm getting the feeling that inserting a field into an object is O(log n), not O(1) as I would have expected. I've read some conflicting posts on this, though, so I'm not sure, and I guess it may depend on the browser. Is there any way to consistently solve this with an O(n) algorithm in JavaScript?
Upvotes: 0
Views: 285
Reputation: 46408
First of all, JavaScript does not require that objects be implemented as hashes (average O(1) lookup) rather than using some other structure, like a B-tree, that is O(log(n)). So nobody can guarantee the performance of object attribute lookup. In practice, though, implementations usually do use hashes, which are O(1).
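As a footnote, ES2015's Map is the closest thing to a guaranteed fast lookup: the spec requires average access times sublinear in the number of elements, and engines back it with hash tables in practice. A minimal sketch of the same O(n) algorithm on top of Map (the alg3 name is just for illustration):

var alg3 = function(people, ids) {
  // Map keys keep their type, so numeric ids are not coerced to strings.
  var peopleMap = new Map();
  people.forEach(function(person) {
    peopleMap.set(person.id, person);
  });
  return ids.map(function(id) {
    return peopleMap.get(id);
  });
};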
But as the joke says, "for all intents and purposes, log(n) is a constant; for Google, it is a slightly larger constant." That captures a real truth. The difference between log(1000) and log(1000000000) is a factor of 3. The difference between accessing something in a CPU cache versus going to RAM is a factor of 10. While in theory O(1) is better than O(log(n)), in practice, with the sizes of data that we actually are likely to encounter, implementation details make a far larger difference. And either can win.
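To see the factor of 3 concretely (the base of the logarithm cancels out of the ratio):

console.log(Math.log(1e9) / Math.log(1e3)); // ≈ 3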
So your benchmarks say nothing useful about the theoretical scaling rules. And the fact that different implementations have different performance winners for your use case is one of the unfortunate facts about the world you have to work in.
Upvotes: 4