Robin

Reputation: 69

How can I delete keys by iterating over an array of hashes and by considering 2 hashes at a time?

I have an array of hashes. I want to iterate over it, comparing two hashes at a time, and remove a key-value pair from one of the two hashes when the same key is present in both.

For example:

log = [{"sasha"=>"HELLO", "robin"=>"HI"},{"jack"=>"HI", "joey"=>"BYE", "robin"=>"THERE"}]

In the above array, one key, "robin", is common to both hashes. I would like to remove "robin"=>"THERE" from the second hash, while the first hash remains unchanged.

Also, the keys could be anything; they are not fixed key names.

I have tried (h1.keys & h2.keys).each {|k| puts ( h1[k] == h2[k] ? h1[k] : k ) }, but I would like a way to do this that also works when the array contains more than two hashes.

Upvotes: 0

Views: 1114

Answers (3)

Kelvin

Reputation: 20857

It sounds like you're asking for pairs of hashes to be compared even if they aren't consecutive. Array#combination helps with this. You tell it how many elements you want to "consider" at a time, and that many elements are yielded to the block on each iteration.

log = [
  {"sasha"=>"HELLO", "robin"=>"HI"},
  {"jack"=>"HI", "joey"=>"BYE", "robin"=>"THERE"},
  {"sasha"=>"GONE", "jane"=>"KEEP"},
  {"jack"=>"RM", "mary"=>"STAY"}
]

# yield all combinations of the elements, taken 2 at a time
log.combination(2) { |h1, h2|
  (h1.keys & h2.keys).each { |k|
    # delete from 2nd hash, but you can easily delete from the first if desired
    h2.delete(k)
  }
}

log
=> [{"sasha"=>"HELLO", "robin"=>"HI"}, {"jack"=>"HI", "joey"=>"BYE"}, {"jane"=>"KEEP"}, {"mary"=>"STAY"}]

See https://en.wikipedia.org/wiki/Combination for more info on the concept of combinations.
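
Because every earlier hash is compared with every later one, the loop above ends up keeping each key only in the first hash that contains it. As a minimal sketch under that reading, the same result can be built without mutating log by tracking the keys seen so far in a Set. The name keep_first_occurrences is a hypothetical helper introduced for illustration, not part of the answer.

```ruby
require "set"

# Hypothetical helper: returns new hashes in which each key survives
# only in the first hash that contains it. The input is not mutated.
def keep_first_occurrences(hashes)
  seen = Set.new
  hashes.map do |h|
    # Set#add? returns nil when the key was already recorded,
    # so select drops that key-value pair.
    h.select { |k, _v| seen.add?(k) }
  end
end

log = [
  {"sasha"=>"HELLO", "robin"=>"HI"},
  {"jack"=>"HI", "joey"=>"BYE", "robin"=>"THERE"},
  {"sasha"=>"GONE", "jane"=>"KEEP"},
  {"jack"=>"RM", "mary"=>"STAY"}
]

deduped = keep_first_occurrences(log)
```

This makes a single pass over the keys instead of visiting every pair of hashes, which may matter if the array is long.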

Upvotes: 0

Cary Swoveland

Reputation: 110675

Suppose:

log = [
  {"sasha"=>"HELLO", "robin"=>"HI"},
  {"jack"=>"HI", "joey"=>"BYE", "robin"=>"THERE", "lois"=>"HUH" },
  {"trixie"=>"HO", "joey"=>"BYE", "robin"=>"THERE","bubba"=>"WHERE"},
  {"billie-bob"=>"YO", "bubba"=>"SHARE", "robin"=>"THERE"}
]

As I understand, for each pair of consecutive elements in log (hashes), each key of the first hash that is also a key of the second hash is to be removed from the first hash. (At the end I consider also removing keys in the last hash that are present in the first hash, as mentioned by the OP in a comment.)

There are many ways to do this. Here is one.

new_keys = (log.map(&:keys) << []).each_cons(2).map { |a1,a2| a1-a2 }
  #=> [["sasha"], ["jack", "lois"], ["trixie", "joey"], 
  #    ["billie-bob", "bubba", "robin"]]
arr = log.map!.with_index { |h,i| h.slice(*new_keys[i]) }
  #=> [{"sasha"=>"HELLO"}, {"jack"=>"HI", "lois"=>"HUH"},
  #    {"trixie"=>"HO", "joey"=>"BYE"},
  #    {"billie-bob"=>"YO", "bubba"=>"SHARE", "robin"=>"THERE"}]

We can confirm that log itself was modified in place (map! replaced its elements with the new hashes):

arr == log
  #=> true

If it were desired to return a new array of hashes and not mutate log, substitute Array#map for Array#map!.
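
A minimal sketch of that non-mutating variant, reusing the same sample data; the variable name trimmed is only illustrative.

```ruby
log = [
  {"sasha"=>"HELLO", "robin"=>"HI"},
  {"jack"=>"HI", "joey"=>"BYE", "robin"=>"THERE", "lois"=>"HUH"},
  {"trixie"=>"HO", "joey"=>"BYE", "robin"=>"THERE", "bubba"=>"WHERE"},
  {"billie-bob"=>"YO", "bubba"=>"SHARE", "robin"=>"THERE"}
]

new_keys = (log.map(&:keys) << []).each_cons(2).map { |a1, a2| a1 - a2 }

# map (without the bang) builds a new array and leaves log untouched
trimmed = log.map.with_index { |h, i| h.slice(*new_keys[i]) }

log.first     #=> {"sasha"=>"HELLO", "robin"=>"HI"}
trimmed.first #=> {"sasha"=>"HELLO"}
```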

See Enumerable#each_cons and Hash#slice.
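
For readers who have not used these two methods, here is a small standalone illustration on toy data (the values are made up for the example). Note that Hash#slice requires Ruby 2.5 or later.

```ruby
# each_cons(2) yields overlapping windows of two consecutive elements
windows = [1, 2, 3, 4].each_cons(2).to_a
  #=> [[1, 2], [2, 3], [3, 4]]

# Hash#slice returns a new hash holding only the listed keys;
# keys that are not present ("z" here) are silently ignored
h = {"a"=>1, "b"=>2, "c"=>3}
picked = h.slice("a", "c", "z")
  #=> {"a"=>1, "c"=>3}
```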

The first operation, computing the new keys for each hash, involves the following steps.

  a = log.map(&:keys)
    #=> [["sasha", "robin"], ["jack", "joey", "robin", "lois"],
    #    ["trixie", "joey", "robin", "bubba"],
    #    ["billie-bob", "bubba", "robin"]] 
  b = a << []
    #=> [["sasha", "robin"], ["jack", "joey", "robin", "lois"],
    #    ["trixie", "joey", "robin", "bubba"],
    #    ["billie-bob", "bubba", "robin"], []] 
  enum = b.each_cons(2)
    #=> #<Enumerator: [["sasha", "robin"],...
    #     ["billie-bob", "bubba", "robin"], []]:each_cons(2)> 

We can see the elements that the enumerator enum will generate and pass to map by converting it to an array.

enum.to_a
  #=> [[["sasha", "robin"], ["jack", "joey", "robin", "lois"]],
  #    [["jack", "joey", "robin", "lois"],
  #     ["trixie", "joey", "robin", "bubba"]],
  #    [["trixie", "joey", "robin", "bubba"],
  #     ["billie-bob", "bubba", "robin"]],
  #    [["billie-bob", "bubba", "robin"], []]]

It is seen that each element generated is an array of two arrays holding the keys of two successive hashes in log. Continuing,

new_keys = enum.map { |a1,a2| a1-a2 }
  #=> [["sasha"], ["jack", "lois"], ["trixie", "joey"],
  #    ["billie-bob", "bubba", "robin"]] 

It should now be evident why we needed the last array generated by enum to contain the keys of the last hash of log and an empty array. That way the last hash in log is not modified.

It is not necessary to compute the new keys for each element of log as a first step, but doing so has the advantage that it avoids the need to extract the keys for each pair of contiguous hashes, which would require 2*(log.size) - 1 operations.
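
For comparison, a sketch of the direct approach that skips the separate new_keys step and compares each hash with its successor. It assumes log is non-empty and returns a new array rather than modifying log; the name direct is illustrative.

```ruby
log = [
  {"sasha"=>"HELLO", "robin"=>"HI"},
  {"jack"=>"HI", "joey"=>"BYE", "robin"=>"THERE", "lois"=>"HUH"},
  {"trixie"=>"HO", "joey"=>"BYE", "robin"=>"THERE", "bubba"=>"WHERE"},
  {"billie-bob"=>"YO", "bubba"=>"SHARE", "robin"=>"THERE"}
]

# For each consecutive pair, drop from the first hash any key that the
# second hash also has. Hash#reject returns a new hash.
direct = log.each_cons(2).map { |h1, h2| h1.reject { |k, _v| h2.key?(k) } }

# The last hash has no successor, so a copy of it is appended unchanged.
direct << log.last.dup
```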

The second operation, using new_keys to modify the existing hashes, involves the following steps.

enum0 = log.map!
  #=> #<Enumerator: [{"sasha"=>"HELLO", "robin"=>"HI"},...
  #     "bubba"=>"SHARE", "robin"=>"THERE"}]:map!>
enum1 = enum0.with_index
  #=> #<Enumerator: #<Enumerator: [{"sasha"=>"HELLO", "robin"=>"HI"},...
  #     "bubba"=>"SHARE", "robin"=>"THERE"}]:map!>:with_index>

The first element is generated by enum1 and passed to the block as a two-element array, which is split into its two elements, which in turn are assigned to the two block variables. This process is called array decomposition.

h,i = enum1.next
  #=> [{"sasha"=>"HELLO", "robin"=>"HI"}, 0]
h #=> {"sasha"=>"HELLO", "robin"=>"HI"}
i #=> 0

The block calculation is now performed.

a = new_keys[i]
  #=> ["sasha"]
h.slice(*a)
  #=> {"sasha"=>"HELLO"}

The first element of log is therefore replaced by {"sasha"=>"HELLO"}. Next,

h,i = enum1.next
  #=> [{"jack"=>"HI", "joey"=>"BYE", "robin"=>"THERE", "lois"=>"HUH"}, 1]
h #=> {"jack"=>"HI", "joey"=>"BYE", "robin"=>"THERE", "lois"=>"HUH"}
i #=> 1
a = new_keys[i]
  #=> ["jack", "lois"]
h.slice(*a)
  #=> {"jack"=>"HI", "lois"=>"HUH"}

The remaining steps are similar.
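
To see all of the steps at once, the per-element calculation can be traced in a plain loop. This is only an illustrative sketch that leaves log unmodified; steps is a name introduced here.

```ruby
log = [
  {"sasha"=>"HELLO", "robin"=>"HI"},
  {"jack"=>"HI", "joey"=>"BYE", "robin"=>"THERE", "lois"=>"HUH"},
  {"trixie"=>"HO", "joey"=>"BYE", "robin"=>"THERE", "bubba"=>"WHERE"},
  {"billie-bob"=>"YO", "bubba"=>"SHARE", "robin"=>"THERE"}
]

new_keys = (log.map(&:keys) << []).each_cons(2).map { |a1, a2| a1 - a2 }

# Pair each index with the hash that would replace log[i]
steps = log.each_with_index.map { |h, i| [i, h.slice(*new_keys[i])] }

# Print one line per element: index, keys kept, resulting hash
steps.each do |i, kept|
  puts "#{i}: keep #{new_keys[i].inspect} -> #{kept.inspect}"
end
```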

If it were desired to also remove keys from the last hash in log that are also present in the first hash in log, the first step would be changed as follows.

new_keys = (log.map(&:keys) << log.first.keys).
             each_cons(2).map { |a1,a2| a1-a2 }
  #=> [["sasha"], ["jack", "lois"], ["trixie", "joey"],
  #    ["billie-bob", "bubba"]] 

as

log.map(&:keys) << log.first.keys
  #=> [["sasha", "robin"], ["jack", "joey", "robin", "lois"],
  #    ["trixie", "joey", "robin", "bubba"],
  #    ["billie-bob", "bubba", "robin"], ["sasha", "robin"]] 
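
An equivalent way to express the wrap-around pairing, shown here only as a sketch, is to zip the key lists with a rotated copy of themselves (Array#rotate shifts the list left by one). The result is then applied exactly as before; wrapped is an illustrative name and log is left unmodified.

```ruby
log = [
  {"sasha"=>"HELLO", "robin"=>"HI"},
  {"jack"=>"HI", "joey"=>"BYE", "robin"=>"THERE", "lois"=>"HUH"},
  {"trixie"=>"HO", "joey"=>"BYE", "robin"=>"THERE", "bubba"=>"WHERE"},
  {"billie-bob"=>"YO", "bubba"=>"SHARE", "robin"=>"THERE"}
]

keys = log.map(&:keys)

# Each key list is paired with its successor; the last is paired with the first
new_keys = keys.zip(keys.rotate).map { |a1, a2| a1 - a2 }
  #=> [["sasha"], ["jack", "lois"], ["trixie", "joey"],
  #    ["billie-bob", "bubba"]]

wrapped = log.map.with_index { |h, i| h.slice(*new_keys[i]) }
```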

Upvotes: 1

Todd A. Jacobs

Reputation: 84343

Iterate Through Consecutive Pairs and Delete Matching Keys from Second Element

If you want to compare two hashes at a time from an array of hashes, you can compare each element of the array to the previous one using Enumerable#each_cons. Then, for each pair of consecutive hashes, simply delete the matching keys from the second hash using Hash#delete_if.

log = [
  {"sasha"=>"HELLO", "robin"=>"HI"},
  {"jack"=>"HI", "joey"=>"BYE", "robin"=>"THERE"},
]

log.each_cons(2) do |hash1, hash2|
  hash2.delete_if { |k,v| hash1.has_key? k }
end

log
#=> [{"sasha"=>"HELLO", "robin"=>"HI"}, {"jack"=>"HI", "joey"=>"BYE"}]
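
One thing to be aware of with more than two hashes: because the deletion happens in place, a hash may already have lost a key by the time it is compared with its successor. The sketch below uses made-up sample data and a hypothetical sample_log helper to contrast that behavior with a variant that compares against each neighbor's original keys. Which of the two is wanted depends on the requirements.

```ruby
# Hypothetical helper that returns a fresh copy of the sample data
def sample_log
  [
    {"robin"=>"HI"},
    {"robin"=>"THERE", "jack"=>"HI"},
    {"robin"=>"AGAIN"}
  ]
end

# 1) In-place each_cons: the second hash loses "robin" first, so the
#    third hash no longer matches it and keeps its own "robin".
chained = sample_log
chained.each_cons(2) { |h1, h2| h2.delete_if { |k, _v| h1.key?(k) } }
  # chained is now [{"robin"=>"HI"}, {"jack"=>"HI"}, {"robin"=>"AGAIN"}]

# 2) Snapshot variant: record every hash's keys before deleting anything,
#    then compare each hash with its predecessor's original keys.
snapshot = sample_log
original_keys = snapshot.map(&:keys)
snapshot.each_with_index do |h, i|
  next if i.zero?
  h.delete_if { |k, _v| original_keys[i - 1].include?(k) }
end
  # snapshot is now [{"robin"=>"HI"}, {"jack"=>"HI"}, {}]
```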

Upvotes: 1
