Cjoerg

Reputation: 1325

How to speed up the process of a simple array iteration?

My array has 75,000 records and looks like this:

orders = [{ :order_id=>"7617",
            :date=>"2014-11-17 19:24:31",
            :firstname=>"Jack",
            :lastname=>"Bauer"},
          { :order_id=>"7618",
            :date=>"2014-11-17 19:34:51",
            :firstname=>"James",
            :lastname=>"Bond"},
            ... ]

I now need to loop through this array with the following code:

# Defined before the loop so the method exists when it is first called.
def prepare_order(order)
  order_id = order[:order_id]

  [ order_id,
    { :order_id => order_id,
      :name => "#{order[:firstname]} #{order[:lastname]}",
      :date => Time.zone.parse(order[:date]),
      :customer_id => Moped::BSON::ObjectId.new } ]
end

order_id_array = []
order_array    = []

orders.each do |order|
  prepared_order = prepare_order(order)
  order_id_array << prepared_order[0]
  order_array    << prepared_order[1]
end

This process takes about 15 seconds, which is far too long. Sometimes my array contains 5M+ hashes.

How do I speed up this process?

I have tried to use the parallel gem like this:

Parallel.each(orders, :in_threads => 3){ |order|
  ...
}

However, this made no noticeable difference to the run time.

Upvotes: 0

Views: 102

Answers (1)

mattt

Reputation: 19544

Profile your code to see what the bottleneck is.
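As a rough sketch of what that profiling could look like, the steps of prepare_order can be timed one at a time with Ruby's standard Benchmark module. The sample data and the name `timings` are made up for illustration, and two stand-ins are assumptions so that the snippet runs without Rails or Moped: `Time.parse` replaces `Time.zone.parse`, and `SecureRandom.hex` replaces `Moped::BSON::ObjectId.new`. In the real app you would time the real calls.

```ruby
require 'benchmark'
require 'time'
require 'securerandom'

# Hypothetical sample data in the same shape as the question's orders.
sample_orders = Array.new(1_000) do |i|
  { :order_id  => (7617 + i).to_s,
    :date      => "2014-11-17 19:24:31",
    :firstname => "Jack",
    :lastname  => "Bauer" }
end

# Time each step of prepare_order separately.
# Assumption: Time.parse and SecureRandom.hex are stand-ins for
# Time.zone.parse and Moped::BSON::ObjectId.new.
timings = {
  :name  => Benchmark.realtime { sample_orders.each { |o| "#{o[:firstname]} #{o[:lastname]}" } },
  :parse => Benchmark.realtime { sample_orders.each { |o| Time.parse(o[:date]) } },
  :id    => Benchmark.realtime { sample_orders.each { SecureRandom.hex(12) } }
}

# Print seconds spent per step; the largest number is the bottleneck.
timings.each { |step, seconds| puts format("%-6s %.4fs", step, seconds) }
```

Whichever step dominates the output is the one worth optimizing first.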

If I had to guess, Time.zone.parse is probably where 80% or more of the time goes. Because the date format is fixed, you could improve performance dramatically by constructing the time object manually from its components, extracted as substrings at fixed positions.
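A minimal sketch of the substring approach, assuming every date string has exactly the "YYYY-MM-DD HH:MM:SS" layout shown in the question and that the timestamps are in UTC (the question does not say which zone they are in). `fast_parse_time` is a hypothetical helper name, not part of any library.

```ruby
require 'time'

# Hypothetical helper: builds a Time from a "YYYY-MM-DD HH:MM:SS" string
# by slicing fixed character ranges instead of running a general parser.
# Assumption: the timestamps are UTC; adjust if the data uses another zone.
def fast_parse_time(str)
  Time.utc(str[0, 4].to_i,   # year
           str[5, 2].to_i,   # month
           str[8, 2].to_i,   # day
           str[11, 2].to_i,  # hour
           str[14, 2].to_i,  # minute
           str[17, 2].to_i)  # second
end

parsed = fast_parse_time("2014-11-17 19:24:31")
puts parsed.iso8601
```

In a Rails app, passing the same six components to Time.zone.local should give a time in the configured zone instead of UTC. Note that this skips all validation, so a malformed string produces a wrong time or an error rather than a parse failure.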

Upvotes: 2
