teamon

Reputation: 331

Composing JSON from cached strings in Ruby

Consider the following scenario. I have quite a few big hashes that I need to put in an array and then convert to JSON:

hash1 = { ... big hash ... }
hash2 = { ... big hash ... }
hash3 = { ... big hash ... }
array = [hash1, hash2, hash3]

json = JSON.dump(array)

The problem is that generating JSON from those hashes takes a long time, so I would like to cache it. However, I can't cache the whole array, only the individual items. Obviously, putting a cached JSON string into the array gives the wrong result, because the string gets encoded a second time:

hash1 = {:a => 1}
hash1json = JSON.dump(hash1)
array = [hash1json]
JSON.generate(array)
==> ["{\"a\":1}"]

while I need

==> [{"a":1}]

The only way I can think of is doing something like this:

"[#{[hash1json].join(",")}]"
==> [{"a":1}]

This might be enough for this specific case, but it gets much harder if you want to cache part of a deeply nested structure instead of the elements of a simple array.
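To make the manual approach concrete, here is a minimal sketch of what splicing cached fragments by hand could look like once objects are involved as well as arrays. The helpers `json_array` and `json_object` are hypothetical names made up for this illustration, and the sketch assumes every fragment passed in is already valid JSON.

```ruby
require 'json'

# Hypothetical helper: wrap already-encoded JSON fragments in an array.
def json_array(fragments)
  "[#{fragments.join(',')}]"
end

# Hypothetical helper: build an object from key => already-encoded fragment.
# Keys are encoded here; values are inserted verbatim.
def json_object(pairs)
  body = pairs.map { |key, fragment| "#{key.to_s.to_json}:#{fragment}" }
  "{#{body.join(',')}}"
end

cached_a = JSON.generate({ a: 1 })       # pretend these come from a cache
cached_b = JSON.generate({ b: [2, 3] })

result = json_object({
  'data'  => json_array([cached_a, cached_b]),
  'count' => 2.to_json
})
puts result
```

Every level of nesting needs its own helper call, which is exactly why this gets awkward for deep structures.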

Upvotes: 3

Views: 246

Answers (2)

teamon

Reputation: 331

Turns out this is actually dead simple:

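require 'yajl'

# Wraps an already-encoded JSON string so the encoder emits it verbatim.
class CachedJson
  def initialize(str)
    @str = str
  end

  # Accept and ignore any arguments, so the wrapper also works with
  # encoders that pass a state object to to_json.
  def to_json(*)
    @str
  end
end

# A plain string is encoded again as a JSON string:
puts Yajl::Encoder.encode(:data => [{:a => 1}, '{"b":2}'])
# => {"data":[{"a":1},"{\"b\":2}"]}

# A wrapped string is inserted as-is:
puts Yajl::Encoder.encode(:data => [{:a => 1}, CachedJson.new('{"b":2}')])
# => {"data":[{"a":1},{"b":2}]}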

Under the hood, yajl calls to_json on objects it does not encode natively, and that method must return a string. So it's just a matter of wrapping each cached JSON string in a CachedJson object.
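The same wrapper idea should carry over to the standard json library and to nested structures. This is a minimal sketch under the assumption that the json generator (in its default, non-strict mode) also calls to_json on objects it does not know and appends the returned string unchanged. `RawJson` is a hypothetical name used only for this example.

```ruby
require 'json'

# Hypothetical wrapper, same idea as CachedJson above.
class RawJson
  def initialize(str)
    @str = str
  end

  # The json library passes its generator state as an argument; ignore it.
  def to_json(*)
    @str
  end
end

cached = JSON.generate({ b: 2 })   # pretend this string came from a cache

# The wrapper can sit anywhere in the structure, not only at the top level.
payload = {
  data: [{ a: 1 }, RawJson.new(cached)],
  meta: { item: RawJson.new(cached) }
}

out = JSON.generate(payload)
puts out
```

Note that the wrapper trusts its input: if the cached string is not valid JSON, the resulting document will not be valid either.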

Upvotes: 2

fmendez

Reputation: 7338

EDIT

My previous answer missed the performance aspect of the question completely (sorry about that), so here is what I found. Perhaps it can help you a little.

In situations like this, yajl-ruby, which is a binding to the C yajl library, seems to speed up the conversion. For instance, here I'm generating a hash with 10,000 entries:

  require 'json'
  require 'yajl'
  require 'benchmark'

  # Build a hash with 10,000 string entries: {"0" => "0", "1" => "1", ...}
  domains = {}
  10000.times do |i|
    domains[i.to_s] = i.to_s
  end

  puts "JSON DUMP #{Benchmark.measure { JSON.dump(domains) }} "

  puts "Yajl::Encoder #{Benchmark.measure { Yajl::Encoder.encode(domains) }}"

And these are the results:

JSON DUMP   0.010000   0.000000   0.010000 (  0.007495)

Yajl::Encoder   0.000000   0.000000   0.000000 (  0.003542)

I'm consistently halving the time it takes to convert to JSON. Hope it helps!
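A single run that takes a few milliseconds is quite noisy, so it may be worth repeating the measurement. The following is only a sketch: the repeat count of 50 is an arbitrary choice, and the yajl part is skipped if the yajl-ruby gem is not installed.

```ruby
require 'json'
require 'benchmark'

begin
  require 'yajl' # optional, provided by the yajl-ruby gem
rescue LoadError
  # yajl-ruby is not installed; only the json library is timed
end

# Same shape as above: 10,000 string keys mapped to string values.
domains = (0...10_000).to_h { |i| [i.to_s, i.to_s] }
runs = 50 # arbitrary repeat count, chosen only to smooth out noise

timings = {
  'JSON.dump' => Benchmark.realtime { runs.times { JSON.dump(domains) } }
}
if defined?(Yajl::Encoder)
  timings['Yajl::Encoder'] =
    Benchmark.realtime { runs.times { Yajl::Encoder.encode(domains) } }
end

# Report the average wall-clock time per encode call.
timings.each do |name, secs|
  puts format('%-14s %.6f s per run', name, secs / runs)
end
```

The absolute numbers will depend on the machine, the Ruby version, and the versions of both libraries.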

Upvotes: 0
