Reputation: 1325
My array has 75,000 records and looks like this:
orders = [
  { :order_id=>"7617",
    :date=>"2014-11-17 19:24:31",
    :firstname=>"Jack",
    :lastname=>"Bauer" },
  { :order_id=>"7618",
    :date=>"2014-11-17 19:34:51",
    :firstname=>"James",
    :lastname=>"Bond" },
  ... ]
I now need to loop through this array with the following code:
order_id_array = []
order_array = []

orders.each do |order|
  prepared_order = prepare_order(order)
  order_id_array << prepared_order[0]
  order_array << prepared_order[1]
end
def prepare_order(order)
  order_id = order[:order_id]

  [ order_id,
    { :order_id => order_id,
      :name => "#{order[:firstname]} #{order[:lastname]}",
      :date => Time.zone.parse(order[:date]),
      :customer_id => Moped::BSON::ObjectId.new } ]
end
This process takes about 15 seconds, which is far too slow. Sometimes my array contains 5M+ hashes.
How do I speed up this process?
I have tried to use the parallel gem like this:
Parallel.each(orders, :in_threads => 3) { |order|
  ...
}
However, this didn't improve performance at all.
Upvotes: 0
Views: 102
Reputation: 19544
Profile your code to see what the bottleneck is.
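For example, a quick check with Ruby's built-in Benchmark (the 10,000-record sample size here is arbitrary) can show how much of the total time goes to date parsing alone; for a line-by-line breakdown, a profiler such as ruby-prof gives more detail:

require "benchmark"

# Time the full per-order work against the date parsing in isolation.
sample = orders.first(10_000)

Benchmark.bm(22) do |x|
  x.report("full prepare_order:") { sample.each { |o| prepare_order(o) } }
  x.report("date parsing only:")  { sample.each { |o| Time.zone.parse(o[:date]) } }
end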
If I had to guess, Time.zone.parse is probably where 80% or more of the computation time is going. Given a fixed date format, you could improve performance dramatically by constructing the time object manually, extracting each component from the substring at its known offset.
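If profiling confirms that guess, a minimal sketch of the manual approach could look like the following. It assumes every :date string uses the fixed "YYYY-MM-DD HH:MM:SS" layout shown in the question, and fast_parse is a hypothetical helper name, not part of any library:

# Hypothetical replacement for Time.zone.parse(order[:date]).
# Assumes the fixed layout "YYYY-MM-DD HH:MM:SS" from the question.
def fast_parse(str)
  Time.zone.local(
    str[0, 4].to_i,   # year
    str[5, 2].to_i,   # month
    str[8, 2].to_i,   # day
    str[11, 2].to_i,  # hour
    str[14, 2].to_i,  # minute
    str[17, 2].to_i   # second
  )
end

Time.zone.local skips the format detection and string scanning that Time.zone.parse performs on every call, which is where the savings come from.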
Upvotes: 2