Phrogz
Phrogz

Reputation: 303244

Efficient memoization of object arguments

Summary: Is there a faster way to hash objects than JSON.stringify?

Details: I have a Ruby and JavaScript library (NeatJSON) that provides pretty-printing of JavaScript values. I recently fixed a problem where deeply-nested objects caused O(n!) performance (n being the nesting level) using memoization based on the object being serialized and the indentation amount.

In Ruby, the fix was really easy, because you can index hashes by arrays of unique sets of objects:

build = ->(object,indent) do
  memoizer[[object,indent]] ||= <all the rest of the code>
end

In JavaScript, however, I can't index an object by another object (in a unique way). Following the lead of several articles I found online, I decide to fix the problem generically, using JSON.stringify on the full set of arguments to the function to create a unique key for memoization:

function memoize(f){
  var memo  = {};
  var slice = Array.prototype.slice;
  return function(){
    var args = slice.call(arguments);
    var mkey = JSON.stringify(args);
    if (!(mkey in memo)) memo[mkey] = f.apply(this,args);
    return memo[mkey];
  }
}

function rawBuild(o,indent){ .. }
var build = memoize(rawBuild);

This works, but (a) it's a little slower than I'd like, and (b) it seems wildly inefficient (and inelegant) to perform (naive) serialization of every object and value that I'm about to serialize smartly. The act of serializing a large object with many values is going to store a string and formatting result for EVERY unique value (not just leaf values) in the entire object.

Is there a modern JavaScript trick that would let me uniquely identify a value? For example, some way of accessing an internal ID, or otherwise associating complex objects with unique integers that takes O(1) time to find the identifier for a value?

Upvotes: 16

Views: 6560

Answers (3)

Phrogz
Phrogz

Reputation: 303244

Using @Bergi's suggestion of a WeakMap I found out about Map, which allows using any value type as the key (not just objects). Because I needed a compound key—uniquely memoizing the combination of the value passed in and the indentation string—I created a hierarchical memoization structure:

function memoizedBuild(){
  var memo = new Map;
  return function(value,indent){
    var byIndent=memo.get(value);
    if (!byIndent) memo.set(value,byIndent={});
    if (!byIndent[indent]) byIndent[indent] = rawBuild(value,indent);
    return byIndent[indent];
  }
}

This proved to be about 4× faster than the memoization code I had been using when serializing a large 270kB JSON object.

Note that in the above code I'm able to use !byIndent[indent] only because I know that rawBuild will never return a falsey value (null, undefined, false, NaN, 0, ""). The safer code line would look something like:

if (!(indent in byIndent)) byIndent[indent] = rawBuild(value,indent);

Upvotes: 4

c-smile
c-smile

Reputation: 27460

If you just need to memoise objects then it makes sense to assign some unique ID to your objects .

var gID = 0;

function createNode() {
  var obj = ...
  obj.id = (++gID).toString();
}

and use those obj.id's as keys in your memo collection.

That would be fastest and least greedy solution.

Update:

If you want that id property to do not clash with existing properties then you can create non-enumerable properties using standard ES5.1 Object.createProperty() (with some unique name) or to use ES6 symbols:

var gID = 0;
var gUidSym = Symbol("uid"); 

function getUidOf(obj) {
  return obj[gUidSym] 
      || (obj[gUidSym] = (++gID).toString());
}

Upvotes: 1

Bergi
Bergi

Reputation: 664538

If you are looking to memoise your objects by identity (not by content), then you'll want to use a WeakMap which is designed for exactly this purpose. They don't work for primitive values though, so you'll need a different solution for such arguments.

Upvotes: 6

Related Questions