JohnMerlino

Reputation: 3928

ruby: can't add a new key into hash during iteration

I have three hashes:

db_headers = {"1"=>"first_name", "2"=>"last_name"}
csv_headers = {"1"=>"First Name", "2"=>"Last Name"}
csv_records = {"0"=>{"id"=>"11", "first_name"=>"first_0", "Last Name"=>"last_0", "created_at"=>"2014-08-12 17:02:28 UTC", "updated_at"=>"2014-08-12 17:02:28 UTC"}, "1"=>{"id"=>"12", "first_name"=>"first_1", "Last Name"=>"last_1", "created_at"=>"2014-08-12 17:02:28 UTC", "updated_at"=>"2014-08-12 17:02:28 UTC"}}

db_headers and csv_headers are matched by their keys. For example, at key "2" they hold "last_name" and "Last Name" respectively. Wherever the two hashes hold different values for the same key, I need to rename the matching key in each csv_records record to the db_headers value. So, for example, the record key "Last Name" should become "last_name", because the values at key "2" in db_headers and csv_headers differ.

This is what I came up with:

  csv_records.each do |record_key,record_value|
    csv_headers.each do |csv_key,csv_value|
      if record_value.has_key? csv_value
          db_headers.each do |db_key, db_value|
            if csv_key == db_key
              csv_records[db_value] = csv_records.delete csv_value
              break
            end
          end
          break
      end
    end
  end

Unfortunately it fails:

RuntimeError: can't add a new key into hash during iteration
    from (irb):12:in `[]='
    from (irb):12:in `block (3 levels) in irb_binding'
    from (irb):10:in `each'
    from (irb):10:in `block (2 levels) in irb_binding'
    from (irb):8:in `each'
    from (irb):8:in `block in irb_binding'
    from (irb):7:in `each'
    from (irb):7

This made the error go away:

csv_records.keys.each do |record_key|
    csv_headers.keys.each do |csv_key|
      if csv_records[record_key].has_key? csv_headers[csv_key]
          db_headers.keys.each do |db_key|
            if csv_key == db_key
              csv_records[db_headers[db_key]] = csv_records.delete csv_headers[csv_key]
              # break is needed because csv_key won't exist in the next iteration
              break
            end
          end
      end
    end
  end

However, each record in csv_records should now have the key "last_name", yet it still has "Last Name" instead.

Upvotes: 2

Views: 6259

Answers (6)

Just add this line to your code; you have to clone your hash first.

csv_records = csv_records.clone

This line fixes only the error itself (ruby: can't add a new key into hash during iteration),

but it seems your code is not finished (if you want to do it your way). Here is your code with my fix (one added line):

csv_records.each do |record_key,record_value|
  csv_headers.each do |csv_key,csv_value|
    if record_value.has_key? csv_value
      db_headers.each do |db_key, db_value|
        if csv_key == db_key
          csv_records = csv_records.clone
          csv_records[db_value] = csv_records.delete csv_value
          break
        end
      end
      break
    end
  end
end

Upvotes: 0

Jaugar Chang

Reputation: 3196

First, derive the replacement rules from db_headers and csv_headers:

map = Hash[db_headers.merge(csv_headers){|_,v1,v2| [v2,v1]}.values]
#=> {"First Name"=>"first_name", "Last Name"=>"last_name"}

Then use tap to rename the keys inside csv_records:

csv_records.tap {|x| 
  map.each {|from,to| 
    x.each{|k,v| 
      x[k][to]=x[k][from] if x[k][from]
      x[k].delete(from) 
    }
  }
}
#=> {"0"=>{"id"=>"11", "first_name"=>"first_0", "created_at"=>"2014-08-12 17:02:28 UTC", "updated_at"=>"2014-08-12 17:02:28 UTC", "last_name"=>"last_0"}, "1"=>{"id"=>"12", "first_name"=>"first_1", "created_at"=>"2014-08-12 17:02:28 UTC", "updated_at"=>"2014-08-12 17:02:28 UTC", "last_name"=>"last_1"}}

Shortened to one line:

csv_records.tap {|x| map.each {|from,to| x.each{|k,v| x[k][to]=x[k].delete(from) if x[k][from] }}}

UPDATE

In your logic, the problems are:

  • There's no need to iterate over db_headers inside the csv_headers iteration; look up the target header directly with db_headers[csv_key].
  • csv_records.delete csv_headers[csv_key] does nothing, because csv_records only has the keys "0" and "1". You should use csv_records[record_key].delete csv_headers[csv_key] instead.

Try this:

csv_records.keys.each do |record_key|
  csv_headers.keys.each do |csv_key|
    if csv_records[record_key].has_key? csv_headers[csv_key]
      csv_records[record_key][db_headers[csv_key]] = csv_records[record_key].delete csv_headers[csv_key] if csv_records[record_key][csv_headers[csv_key]]
    end
  end
end

csv_records

Upvotes: 0

Cary Swoveland

Reputation: 110675

I suggest:

  • select the keys that are in all three hashes
  • of those keys, select the ones that have different values in db_headers and csv_headers
  • for those keys, swap the values in db_headers and csv_headers

Converting this approach to code is straightforward:

(csv_records.keys & db_headers.keys & csv_headers.keys).select { |k|
  db_headers[k] != csv_headers[k] }.each { |k|
    db_headers[k], csv_headers[k] = csv_headers[k], db_headers[k] }

db_headers  #=> {"1"=>"First Name", "2"=>"last_name"}
csv_headers #=> {"1"=>"first_name", "2"=>"Last Name"}

We have

keys = csv_records.keys & db_headers.keys & csv_headers.keys
  #=> ["1"]

selected_keys = keys.select { |k| db_headers[k] != csv_headers[k] }
  #=> ["1"]

Then perform a parallel assignment for each of these keys (here just one):

selected_keys.each { |k|
  db_headers[k], csv_headers[k] = csv_headers[k], db_headers[k] }

Upvotes: 0

Jay Mitchell

Reputation: 1240

Unless you have major memory constraints, use reduce to build up your desired records.

# If you need to keep csv_headers and db_headers for another reason, you can use them to create REPLACE_KEYS.
REPLACE_KEYS = {"First Name"=>"first_name", "Last Name"=>"last_name"}
csv_records = {"0"=>{"id"=>"11", "first_name"=>"first_0", "Last Name"=>"last_0", "created_at"=>"2014-08-12 17:02:28 UTC", "updated_at"=>"2014-08-12 17:02:28 UTC"}, "1"=>{"id"=>"12", "first_name"=>"first_1", "Last Name"=>"last_1", "created_at"=>"2014-08-12 17:02:28 UTC", "updated_at"=>"2014-08-12 17:02:28 UTC"}}    

def transform_record(record)
  record.reduce({}) do |acc, (key, value)|
    new_key = REPLACE_KEYS[key] || key
    acc[new_key] = value
    acc
  end
end

db_records = csv_records.reduce({}) do |acc, (row, record)|
  acc[row] = transform_record(record)
  acc
end
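The comment above mentions building REPLACE_KEYS from csv_headers and db_headers. One possible way to do that is sketched below; build_replace_keys is a hypothetical helper name that is not part of the question or of this answer, and it assumes the two header hashes share keys as in the question.

```ruby
# Hypothetical helper (name is an assumption): build a
# CSV-name => DB-name lookup from the two header hashes.
def build_replace_keys(db_headers, csv_headers)
  csv_headers.each_with_object({}) do |(key, csv_name), acc|
    db_name = db_headers[key]
    # Skip keys missing from db_headers and names that already match.
    acc[csv_name] = db_name if db_name && db_name != csv_name
  end
end

db_headers  = {"1"=>"first_name", "2"=>"last_name"}
csv_headers = {"1"=>"First Name", "2"=>"Last Name"}

replace_keys = build_replace_keys(db_headers, csv_headers)
# replace_keys is {"First Name"=>"first_name", "Last Name"=>"last_name"}
```

The result could be used in place of the hard-coded REPLACE_KEYS constant if the headers are only known at runtime.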

Upvotes: 2

ClaytonC

Reputation: 475

There is a bug in your code. You are trying to modify the wrong hash (which is easy to do when you have nested hashes). So:

csv_records[db_value] = csv_records.delete csv_value

should instead be:

record_value[db_value] = record_value.delete csv_value

That alone should fix your problem.

Also, as a further tip, it seems that you could combine the csv_headers and db_headers hashes into one hash:

{ "First Name" => "first_name",
"Last Name" => "last_name" }

Which should allow you to simplify the logic of your loop.
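As a rough sketch of what the simplified loop might look like with the combined hash: rename_record_keys! and header_map are hypothetical names used only for illustration, and the records are trimmed down from the question's data.

```ruby
# Hypothetical helper (name is an assumption): rename keys inside
# each record in place, using a combined CSV-name => DB-name lookup.
def rename_record_keys!(records, header_map)
  records.each_value do |record|
    header_map.each do |csv_name, db_name|
      # Only the inner record is modified, and it is not the hash
      # currently being iterated, so no iteration error is raised.
      record[db_name] = record.delete(csv_name) if record.key?(csv_name)
    end
  end
  records
end

header_map = { "First Name" => "first_name", "Last Name" => "last_name" }
csv_records = {
  "0" => { "id" => "11", "first_name" => "first_0", "Last Name" => "last_0" },
  "1" => { "id" => "12", "first_name" => "first_1", "Last Name" => "last_1" }
}

rename_record_keys!(csv_records, header_map)
# csv_records["0"] now has "last_name" => "last_0" and no "Last Name" key
```

This mutates the records in place; if the original csv_records must stay untouched, build new hashes instead.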

Upvotes: 0

Amadan

Reputation: 198324

Iterate on hash.keys instead of on hash. #keys will make an array that is separate from the hash, so you won't be messing up the iteration as you modify the hash.
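A small illustration of the difference, using a made-up settings hash and a hypothetical upcase_keys! helper rather than the hashes from the question:

```ruby
# Hypothetical helper (name is an assumption): rename every key by
# iterating over a snapshot array of the keys, not the hash itself.
def upcase_keys!(hash)
  hash.keys.each do |key|
    hash[key.upcase] = hash.delete(key)
  end
  hash
end

settings = { "a" => 1, "b" => 2 }

begin
  # Inserting a new key while #each walks the same hash raises.
  settings.each { |key, value| settings[key.upcase] = value }
rescue RuntimeError => e
  puts e.message
end

upcase_keys!(settings)
# settings is now {"A"=>1, "B"=>2}
```

Note that this only avoids the iteration error. In the question's code the assignment also has to target the inner record rather than csv_records itself.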

Upvotes: 7
