Adam Smith

Reputation: 111

Merge Ruby arrays

I have a few arrays of Ruby objects of class UserInfo:

class UserInfo  
    attr_accessor :name, :title, :age
end

How can I merge these arrays into one array? A user is identified by its name, so I want no duplicate names. If name, title, age, etc. are equal I'd like to have 1 entry in the new array. If names are the same, but any of the other details differ I probably want those 2 users in a different array to manually fix the errors.

Thanks in advance

Upvotes: 1

Views: 2451

Answers (3)

edgerunner

Reputation: 14973

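Redefine equality on your object and you can get rid of actual duplicates quickly. Note that Array#uniq and the union operator | compare elements with hash and eql? rather than ==, so define all three:

class UserInfo
  attr_accessor :name, :title, :age

  def ==(other)
    name == other.name && title == other.title && age == other.age
  end
  alias_method :eql?, :==

  # equal objects must return equal hashes for uniq and | to merge them
  def hash
    [name, title, age].hash
  end
end

# assuming a and b are arrays of UserInfo objects
c = a | b
# c will only contain one of each UserInfo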

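Then you can sort by name and group the name-only duplicates together

d = c.sort_by { |user| user.name } # sort by name
name = nil
e = []
d.each do |item|
  if item.name == name
    # same name as the previous item: collect both in one nested array
    e[-1] = [e[-1], item].flatten
  else
    e << item
    name = item.name # remember the last name seen
  end
end
# e now holds single UserInfo objects plus arrays of same-name conflicts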
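
If the goal is the split the question describes (one clean array plus the conflicts for manual review), group_by and partition can produce it without the manual loop. This is only a minimal sketch: the three-argument initialize is an assumption that the original class does not have, and merged and conflicts are illustrative names.

```ruby
# Hypothetical stand-in for the UserInfo class from the question
class UserInfo
  attr_accessor :name, :title, :age

  def initialize(name, title, age) # assumed constructor, not in the original
    @name, @title, @age = name, title, age
  end

  def ==(other)
    other.is_a?(UserInfo) && name == other.name && title == other.title && age == other.age
  end
  alias_method :eql?, :==

  def hash
    [name, title, age].hash
  end
end

a = [UserInfo.new("Ann", "Dev", 30), UserInfo.new("Bob", "QA", 41)]
b = [UserInfo.new("Ann", "Dev", 30), UserInfo.new("Bob", "Ops", 41)]

# Drop exact duplicates, then bucket what is left by name
groups = (a | b).group_by(&:name)

# A name with more than one distinct record is a conflict
conflicts, clean = groups.values.partition { |users| users.size > 1 }
merged = clean.flatten
```

Here merged contains only users whose name maps to a single record, and conflicts is an array of arrays, one per name that needs manual attention.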

Upvotes: 1

Paul Rubel

Reputation: 27222

Here's another potential way. If you have a way of identifying each UserInfo, say a to_str method that returns the values as one string:

  def to_str()
    return "#{@name}:#{@title}:#{@age}"
  end

You can use inject and a hash

all_users = a + b # collection of users to "merge"    
res = all_users.inject({}) do |h, v|
  h[v.to_str] = v  #save the value indexed on the string output
  h # return h for the next iteration
end

merged = res.values #the unique users
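
A similar result is possible without building a string key, since Array#uniq accepts a block (Ruby 1.9.2 and later) and deduplicates on the block's return value. The following is a minimal sketch, again assuming a three-argument initialize that the original class does not have. Note that it keeps the first occurrence of each user, whereas the inject version above keeps the last.

```ruby
# Hypothetical stand-in for the UserInfo class from the question
class UserInfo
  attr_accessor :name, :title, :age

  def initialize(name, title, age) # assumed constructor, not in the original
    @name, @title, @age = name, title, age
  end
end

a = [UserInfo.new("Ann", "Dev", 30), UserInfo.new("Bob", "QA", 41)]
b = [UserInfo.new("Ann", "Dev", 30), UserInfo.new("Bob", "Ops", 41)]

all_users = a + b

# Two users count as the same when all three fields match
merged = all_users.uniq { |u| [u.name, u.title, u.age] }
```

Using an array as the key also avoids accidental collisions when a field itself contains the ":" separator.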

Upvotes: 0

Jonas Elfström

Reputation: 31428

A year ago I monkey patched a kind of cryptic instance_variables_compare on Object. I guess you could use that.

class Object
  def instance_variables_compare(o)
    Hash[*self.instance_variables.map {|v|
      self.instance_variable_get(v)!=o.instance_variable_get(v) ? 
      [v,o.instance_variable_get(v)] : []}.flatten]
  end
end

A cheesy example

require 'date'

class Cheese
  attr_accessor :name, :weight, :expire_date
  def initialize(name, weight, expire_date)
    @name, @weight, @expire_date = name, weight, expire_date
  end
end

stilton=Cheese.new('Stilton', 250, Date.parse("2010-12-02"))
gorgonzola=Cheese.new('Gorgonzola', 250, Date.parse("2010-12-17"))

irb is my weapon of choice

>> stilton.instance_variables_compare(gorgonzola)
=> {"@name"=>"Gorgonzola", "@expire_date"=>#<Date: 4910305/2,0,2299161>}
>> gorgonzola.instance_variables_compare(stilton)
=> {"@name"=>"Stilton", "@expire_date"=>#<Date: 4910275/2,0,2299161>}
>> stilton.expire_date=gorgonzola.expire_date
=> #<Date: 4910305/2,0,2299161>
>> stilton.instance_variables_compare(gorgonzola)
=> {"@name"=>"Gorgonzola"}
>> stilton.instance_variables_compare(stilton)
=> {}

As you can see, instance_variables_compare returns an empty Hash if the two objects have the same content.
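
On Ruby 1.9 and later, instance_variables returns symbols rather than strings, so the keys would be :@name instead of "@name". The same comparison can also be sketched as a plain method without patching Object. ivar_diff and Sample are hypothetical names used only for illustration.

```ruby
# Returns a hash of the instance variables whose values differ,
# mapped to the value held by the second object.
def ivar_diff(a, b)
  a.instance_variables.each_with_object({}) do |ivar, diff|
    theirs = b.instance_variable_get(ivar)
    diff[ivar] = theirs unless a.instance_variable_get(ivar) == theirs
  end
end

# Hypothetical class used only to exercise ivar_diff
class Sample
  def initialize(name, weight)
    @name, @weight = name, weight
  end
end

x = Sample.new("Stilton", 250)
y = Sample.new("Stilton", 210)

difference = ivar_diff(x, y) # only @weight differs
same = ivar_diff(x, x)       # nothing differs
```

Building the hash directly also avoids the flatten step, which would misbehave if an instance variable held an array.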

An array of cheese

stilton2=Cheese.new('Stilton', 210, Date.parse("2010-12-02"))
gorgonzola2=Cheese.new('Gorgonzola', 250, Date.parse("2010-12-17"))

arr=[]<<stilton<<stilton2<<gorgonzola<<gorgonzola2

One hash without problems and one with

h={}
problems=Hash.new { |hash, key| hash[key] = [] } # a fresh array per key
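
One thing to watch for when a Hash gets an array as its default: Hash.new([]) hands every missing key the same array object, while the block form creates a new array per key. A minimal sketch of the difference, with shared and fresh as illustrative names:

```ruby
# Default object form: one array shared by every missing key
shared = Hash.new([])
shared[:a] << 1 # mutates the default array, stores no key
shared[:b] << 2 # mutates the same array again

# Block form: a new array is created and stored for each missing key
fresh = Hash.new { |hash, key| hash[key] = [] }
fresh[:a] << 1
fresh[:b] << 2
```

After this runs, shared has no keys at all and its default array holds both values, whereas fresh has a separate array under each key.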

arr.each {|c|
  if problems.has_key?(c.name)
    # this name is already known to conflict, so keep collecting
    problems[c.name] << c
  elsif h.has_key?(c.name)
    if h[c.name].instance_variables_compare(c) != {}
      # same name but different content: move both to problems
      problems[c.name] = [c, h[c.name]]
      h.delete(c.name)
    end
  else
    h[c.name]=c
  end
}

Now the Hash h contains the objects without merging problems, and the problems hash contains those whose instance variables differ.

>> h
=> {"Gorgonzola"=>#<Cheese:0xb375e8 @name="Gorgonzola", @weight=250, @expire_date=#<Date: 2010-12-17 (4911095/2,0,2299161)>>}

>> problems
=> {"Stilton"=>[#<Cheese:0xf54c30 @name="Stilton", @weight=210, @expire_date=#<Date: 2010-12-02 (4911065/2,0,2299161)>>, #<Cheese:0xfdeca8 @name="Stilton", @weight=250,@expire_date=#<Date: 2010-12-02 (4911065/2,0,2299161)>>]}    

As far as I can see you will not have to modify this code at all to support an array of UserInfo objects.

It would most probably be much faster to compare the properties directly or with an override of ==. This is how you override ==

def ==(other)
  return self.weight == other.weight && self.expire_date == other.expire_date
end

and the loop changes into this

arr.each {|c|
  if problems.has_key?(c.name)
    # this name is already known to conflict, so keep collecting
    problems[c.name] << c
  elsif h.has_key?(c.name)
    if h[c.name] != c
      # same name but different content: move both to problems
      problems[c.name] = [c, h[c.name]]
      h.delete(c.name)
    end
  else
    h[c.name]=c
  end
}

Finally you might want to convert the Hash back to an Array

result = h.values

Upvotes: 0
