user2128702

Reputation: 2121

How to remove duplicates from array with custom objects

When I call first_array | second_array on two arrays that contain custom objects:

first_array = [co1, co2, co3]
second_array =[co2, co3, co4]

it returns [co1, co2, co3, co2, co3, co4]. It doesn't remove the duplicates. I tried to call uniq on the result, but it didn't work either. What should I do?

Update:

This is the custom object:

class Task
    attr_accessor :status, :description, :priority, :tags
    def initiate_task task_line
        @status = task_line.split("|")[0]
        @description = task_line.split("|")[1]
        @priority = task_line.split("|")[2]
        @tags = task_line.split("|")[3].split(",")
        return self
    end

    def <=>(another_task)
        stat_comp = (@status == another_task.status)
        desc_comp = (@description == another_task.description)
        prio_comp = (@priority == another_task.priority)
        tags_comp = (@tags == another_task.tags)
        if(stat_comp&desc_comp&prio_comp&tags_comp) then return 0 end
    end
end

When I create a few instances of Task, put them into two different arrays, and call '|' on them, nothing is removed: the result simply contains the elements of both arrays, duplicates included.

Upvotes: 6

Views: 3252

Answers (6)

leandroico

Reputation: 1227

I tried fsaravia's solution and it didn't work for me, on either Ruby 2.3.1 or Ruby 2.4.0.

The solution I've found is very similar to what fsaravia posted though, with a small tweak. So here it is:

class A
  attr_accessor :name

  def initialize(name)
    @name = name
  end

  def eql?(other)
    hash.eql?(other.hash)
  end

  def hash
    name.hash
  end
end

a = A.new('Peter')
b = A.new('Peter')

arr = [a,b]
puts arr.uniq

Please, don't mind that I've removed the @ in my example. It won't affect the solution per se. It's just that, IMO, there wasn't any reason to access the instance variable directly, given a reader method was set for that reason.

So... what I really changed is inside the eql? method, where I compare hash instead of name. That's it!

Upvotes: 1

Jo P

Reputation: 1676

The uniq method can take a block that defines what to compare the objects on. For example:

class Task
  attr_accessor :n
  def initialize(n)
    @n = n
  end
end

t1 = Task.new(1)
t2 = Task.new(2)
t3 = Task.new(2)

a = [t1, t2, t3]

a.uniq
#=> [t1, t2, t3] # because all 3 objects are unique

a.uniq { |t| t.n }
#=> [t1, t2]     # as it's comparing on the value of n in the object
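Applied to the question, the same block form can deduplicate the combined arrays on several attributes at once. This is only a sketch: the Task below is a simplified stand-in that borrows the attribute names from the question, and the sample values are made up.

```ruby
# Simplified stand-in for the question's Task; the sample data is hypothetical.
class Task
  attr_reader :status, :description, :priority, :tags

  def initialize(status, description, priority, tags)
    @status = status
    @description = description
    @priority = priority
    @tags = tags
  end
end

first_array  = [Task.new("TODO", "Buy milk", "High", ["shop"]),
                Task.new("DONE", "Write report", "Low", ["work"])]
second_array = [Task.new("DONE", "Write report", "Low", ["work"]),
                Task.new("TODO", "Call Bob", "Normal", [])]

# Concatenate, then let uniq compare on an array of the relevant attributes.
# Arrays of strings are compared by content, so equal tasks collapse into one.
merged = (first_array + second_array).uniq do |t|
  [t.status, t.description, t.priority, t.tags]
end

puts merged.map(&:description).inspect
# prints ["Buy milk", "Write report", "Call Bob"]
```

This leaves the class itself untouched, which is handy when you cannot or do not want to change how Task defines equality.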

Upvotes: 4

fsaravia

Reputation: 735

No programming language can tell by itself whether two of your objects are equal unless you implement the correct equality methods. In Ruby you need to implement eql? and hash in your class definition, as these are the methods that the Array class uses to check for equality, as stated in Ruby's Array docs:

def eql?(other_obj)
  # Your comparing code goes here
end

def hash
  # Returns an integer computed from the instance variables;
  # objects that are eql? must return the same value
end

For example:

class A

  attr_accessor :name

  def initialize(name)
    @name = name
  end

  def eql?(other)
    @name.eql?(other.name)
  end

  def hash
    @name.hash
  end
end

a = A.new('Peter')
b = A.new('Peter')

arr = [a,b]
puts arr.uniq

This removes b from the array, leaving only one object.
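The same two methods also drive Array#|, so the union from the question starts dropping duplicates once they are defined. A small sketch, assuming a hypothetical Person class in place of A:

```ruby
# Hypothetical class for illustration; equality is based on name only.
class Person
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Array#uniq and Array#| decide equality through hash and eql?,
  # so the two are defined together and kept consistent.
  def eql?(other)
    other.is_a?(Person) && name.eql?(other.name)
  end

  def hash
    name.hash
  end
end

first_array  = [Person.new("Peter"), Person.new("Anna")]
second_array = [Person.new("Anna"), Person.new("John")]

union = first_array | second_array
puts union.map(&:name).inspect
# prints ["Peter", "Anna", "John"]
```

The is_a? guard is an addition of this sketch; it keeps eql? from raising when the array also holds objects that have no name method.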

Hope this helps!

Upvotes: 5

hirolau

Reputation: 13901

I took the liberty of rewriting your class and adding the methods that need to be overridden in order to use uniq (hash and eql?).

class Task

    METHODS = [:status, :description, :priority, :tags]
    attr_accessor *METHODS

    def initialize task_line
        @status, @description, @priority, @tags = *task_line.split("|")
        @tags = @tags.split(",")
    end

    def eql? another_task
       METHODS.all?{|m| self.send(m)==another_task.send(m)}
    end

    alias_method :==, :eql? #Strictly not needed for array.uniq

    def hash
      [@status, @description, @priority, @tags].hash
    end

end


x = [Task.new('1|2|3|4'), Task.new('1|2|3|4')]
p x.size #=> 2
p x.uniq.size #=> 1

Upvotes: 0

Cary Swoveland

Reputation: 110675

Regarding your 'update', is this what you are doing:

a = Task.new # => #<Task:0x007f8d988f1b78> 
b = Task.new # => #<Task:0x007f8d992ea300> 
c = [a,b]    # => [#<Task:0x007f8d988f1b78>, #<Task:0x007f8d992ea300>] 
a = Task.new # => #<Task:0x007f8d992d3e48> 
d = [a]      # => [#<Task:0x007f8d992d3e48>]  
e = c|d      # => [#<Task:0x007f8d988f1b78>, #<Task:0x007f8d992ea300>, \
                   #<Task:0x007f8d992d3e48>] 

and then suggesting that e = [a, b, a]? If so, that's the problem, because a no longer points to #<Task:0x007f8d988f1b78>. All you can say is e => [#<Task:0x007f8d988f1b78>, b, a]
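The effect is easy to reproduce: without custom eql? and hash, two objects count as duplicates only when they are the very same object. A short sketch, using a hypothetical empty Task class:

```ruby
# Hypothetical empty class; it inherits identity-based eql? and hash from Object.
class Task; end

a = Task.new
b = Task.new
c = [a, b]

old_a = a      # keep a handle on the first object
a = Task.new   # rebinding a creates a third, distinct object
d = [a]

e = c | d
puts e.size                  # 3: nothing is treated as a duplicate
puts e.first.equal?(old_a)   # true: the first element is still the original object
puts e.last.equal?(a)        # true: the last element is the new object

# Only when the same object appears in both arrays does | drop it.
puts((c | [old_a]).size)     # 2
```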

Upvotes: 0

jbr

Reputation: 6258

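If you look at the documentation for the Array#| operator, it says that it uses the eql? method, which on Object is the same as the == method. You can get == by mixing in the Comparable module and implementing the <=> method, which gives you lots of comparison methods for free. Note that Comparable does not define eql? or hash, so those still have to be defined for | and uniq to treat equal objects as duplicates.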

The <=> operator is very easy to implement:

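def <=>(obj)
    # Delegate to the <=> of the attribute that defines the ordering.
    # priority (from the question's Task) is only an assumed example; calling
    # <, == or > on self here would recurse once Comparable is mixed in.
    priority <=> obj.priority
end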
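One caveat worth checking for yourself: Comparable derives == from <=>, but not eql? or hash, which are what uniq and | consult. A minimal sketch, assuming a hypothetical Version class that is ordered by a single number:

```ruby
# Hypothetical class for illustration; ordering is based on number alone.
class Version
  include Comparable
  attr_reader :number

  def initialize(number)
    @number = number
  end

  def <=>(other)
    number <=> other.number
  end
end

x = Version.new(1)
y = Version.new(1)

puts(x == y)     # true, supplied by Comparable
before = [x, y].uniq.size
puts before      # 2, because eql? and hash are still Object's defaults

# Adding eql? and hash on top makes uniq and | agree with ==.
class Version
  def eql?(other)
    other.is_a?(Version) && number.eql?(other.number)
  end

  def hash
    number.hash
  end
end

after = [x, y].uniq.size
puts after                 # 1
puts(([x] | [y]).size)     # 1
```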

Upvotes: 0
