Reputation: 303244
Summary: Is there a faster way to hash objects than JSON.stringify
?
Details: I have a Ruby and JavaScript library (NeatJSON) that provides pretty-printing of JavaScript values. I recently fixed a problem where deeply-nested objects caused O(n!) performance (n being the nesting level) using memoization based on the object being serialized and the indentation amount.
In Ruby, the fix was really easy, because you can index hashes by arrays of unique sets of objects:
build = ->(object,indent) do
memoizer[[object,indent]] ||= <all the rest of the code>
end
In JavaScript, however, I can't index an object by another object (in a unique way). Following the lead of several articles I found online, I decide to fix the problem generically, using JSON.stringify
on the full set of arguments to the function to create a unique key for memoization:
function memoize(f){
var memo = {};
var slice = Array.prototype.slice;
return function(){
var args = slice.call(arguments);
var mkey = JSON.stringify(args);
if (!(mkey in memo)) memo[mkey] = f.apply(this,args);
return memo[mkey];
}
}
function rawBuild(o,indent){ .. }
var build = memoize(rawBuild);
This works, but (a) it's a little slower than I'd like, and (b) it seems wildly inefficient (and inelegant) to perform (naive) serialization of every object and value that I'm about to serialize smartly. The act of serializing a large object with many values is going to store a string and formatting result for EVERY unique value (not just leaf values) in the entire object.
Is there a modern JavaScript trick that would let me uniquely identify a value? For example, some way of accessing an internal ID, or otherwise associating complex objects with unique integers that takes O(1) time to find the identifier for a value?
Upvotes: 16
Views: 6560
Reputation: 303244
Using @Bergi's suggestion of a WeakMap
I found out about Map
, which allows using any value type as the key (not just objects). Because I needed a compound key—uniquely memoizing the combination of the value passed in and the indentation string—I created a hierarchical memoization structure:
function memoizedBuild(){
var memo = new Map;
return function(value,indent){
var byIndent=memo.get(value);
if (!byIndent) memo.set(value,byIndent={});
if (!byIndent[indent]) byIndent[indent] = rawBuild(value,indent);
return byIndent[indent];
}
}
This proved to be about 4× faster than the memoization code I had been using when serializing a large 270kB JSON object.
Note that in the above code I'm able to use !byIndent[indent]
only because I know that rawBuild
will never return a falsey value (null
, undefined
, false
, NaN
, 0
, ""
). The safer code line would look something like:
if (!(indent in byIndent)) byIndent[indent] = rawBuild(value,indent);
Upvotes: 4
Reputation: 27460
If you just need to memoise objects then it makes sense to assign some unique ID to your objects .
var gID = 0;
function createNode() {
var obj = ...
obj.id = (++gID).toString();
}
and use those obj.id
's as keys in your memo
collection.
That would be fastest and least greedy solution.
Update:
If you want that id property to do not clash with existing properties then you can create non-enumerable properties using standard ES5.1 Object.createProperty() (with some unique name) or to use ES6 symbols:
var gID = 0;
var gUidSym = Symbol("uid");
function getUidOf(obj) {
return obj[gUidSym]
|| (obj[gUidSym] = (++gID).toString());
}
Upvotes: 1