Marko Avlijaš

Reputation: 1659

Provide the simplest example where a deep copy is needed in Ruby

I really don't understand the difference between shallow and deep copy. Ruby's #dup seems to create a deep copy when I test it.

Documentation says:

Produces a shallow copy of obj---the instance variables of obj are
copied, but not the objects they reference.

But in my test, the duplicate seems to get its own copy of the referenced objects:

class Klass
  attr_accessor :name
end

a = Klass.new
a.name = "John"
b = a.dup
b.name = "Sue"
puts a.name # John

Why is a shallow copy sufficient here, when @name is one of the objects they reference?
What's the simplest example where deep copy is needed?

Upvotes: 2

Views: 376

Answers (2)

ABrowne

Reputation: 1604

Try this:

class Klass
  attr_accessor :name
end

a = Klass.new
a.name = Klass.new #object inside object
a.name.name = 'George'
b = a.dup
puts b.name.name # George

b.name.name = 'Alex'
puts a.name.name # Alex

Also note, from the documentation for dup:

When using dup, any modules that the object has been extended with will not be copied.
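A minimal sketch of that difference, assuming a made-up Greeter module that exists only for this illustration. It compares dup with clone, which does carry over modules added through extend:

```ruby
# Hypothetical module, used only for this illustration.
module Greeter
  def greet
    "Hello, #{name}"
  end
end

class Klass
  attr_accessor :name
end

a = Klass.new
a.name = 'George'
a.extend(Greeter)            # adds Greeter to a's singleton class only

copy_via_dup   = a.dup       # singleton class is not carried over
copy_via_clone = a.clone     # singleton class is carried over

puts a.respond_to?(:greet)               # true
puts copy_via_dup.respond_to?(:greet)    # false
puts copy_via_clone.respond_to?(:greet)  # true
```

Both copies are still shallow: in each case name refers to the same String as a.name.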

Edit: A note on strings (this was interesting to find out). Strings are referenced, not copied, in the original scenario. The following case demonstrates it:

a.name = 'George'
puts a.name.object_id # 69918291262760    

b = a.dup
puts b.name # George
puts b.name.object_id # 69918291262760  

b.name.concat ' likes tomatoes' # append to existing string
puts b.name.object_id # 69918291262760  

puts a.name # George likes tomatoes

This works as expected. Referenced objects (including strings) are not copied, and will share the reference.

So why does the original example appear not to? Because when you set b.name to something different, you are assigning it a new string.

   a.name = 'hello' 

is roughly shorthand for this:

   a.name = String.new('hello')

Therefore, in the original example, a.name and b.name no longer reference the same object; you can check the object_id to confirm.
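As a sketch of that check, using equal? (which compares object identity, just like comparing object_id values) on the question's original example:

```ruby
class Klass
  attr_accessor :name
end

a = Klass.new
a.name = 'John'
b = a.dup

puts a.name.equal?(b.name)  # true: one String, two references to it

b.name = 'Sue'              # rebinds b's @name to a different String
puts a.name.equal?(b.name)  # false
puts a.name                 # John
```

The assignment never touches the "John" string itself, which is why a shallow copy looks sufficient in the question.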

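Note that you cannot observe this sharing with Integers (Fixnum in older Rubies), floats, true, false, nil or symbols. These objects are immutable, so they cannot be modified in place; a shallow copy still shares them, but assigning a new value only rebinds the attribute on one object.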
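For integers and symbols, a quick check suggests that what matters is immutability. This is only a sketch; the age and tag attributes are made up for the illustration:

```ruby
class Klass
  attr_accessor :age, :tag
end

a = Klass.new
a.age = 30
a.tag = :admin
b = a.dup

puts a.age.equal?(b.age)   # true: both refer to the same Integer
puts a.tag.equal?(b.tag)   # true: both refer to the same Symbol
puts 30.frozen?            # true: it cannot be modified in place
puts :admin.frozen?        # true

b.age += 1                 # same as b.age = b.age + 1, a rebinding
puts a.age                 # 30
puts b.age                 # 31
```

Since there is no in-place mutation to leak through the shared reference, a shallow copy behaves like a deep one for these values.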

Upvotes: 1

Holger Just

Reputation: 55833

The example you have shown does not describe the difference between a deep and a shallow copy. Instead, consider this example:

class Klass
  attr_accessor :name
end

anna = Klass.new
anna.name = 'Anna'

anna_lisa = anna.dup
anna_lisa.name << ' Lisa'
# => "Anna Lisa"

anna.name
# => "Anna Lisa"

Generally, dup and clone are both expected to duplicate only the object you call the method on. Other referenced objects, like the name String in the above example, are not duplicated. Thus, after the duplication, both the original and the duplicated object point to the very same name string.

With a deep_dup, typically all (relevant) referenced objects are duplicated too, often to arbitrary depth. Since this is rather hard to achieve for all possible object references, people often rely on implementations for specific objects like hashes and arrays.
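A hand-rolled version for hashes, arrays and strings might look like the sketch below. The deep_dup helper here is hypothetical and written only for this illustration; it does not handle cyclic references, hash default procs, or arbitrary objects:

```ruby
# Hypothetical helper, not part of core Ruby.
def deep_dup(obj)
  case obj
  when Hash
    obj.each_with_object({}) { |(k, v), copy| copy[deep_dup(k)] = deep_dup(v) }
  when Array
    obj.map { |item| deep_dup(item) }
  when String
    obj.dup
  else
    obj   # shared as-is: numbers, symbols, nil, and anything not handled above
  end
end

original = { 'names' => [String.new('Anna')] }
shallow  = original.dup
deep     = deep_dup(original)

deep['names'][0] << ' Lisa'      # only the deep copy's string changes
puts original['names'][0]        # Anna

shallow['names'][0] << ' Lena'   # the shallow copy shares the array and string
puts original['names'][0]        # Anna Lena
```

The else branch is where the "hard to achieve for all possible object references" part shows up: any class not listed is silently shared rather than copied.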

A common workaround for a rather generic deep-dup is to use Ruby's Marshal module to serialize an object graph and immediately deserialize it again.

anna_lena = Marshal.load(Marshal.dump(anna))

This creates new objects and is effectively a deep_dup. Since most objects support marshaling out of the box, this is a rather powerful mechanism. Note, though, that you should never unmarshal (i.e. load) user-provided data, since this can lead to a remote code execution vulnerability.
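To see the round trip in action, here is a small sketch. The marshal_copy helper name is made up for this example, and it should only be fed objects your own program created. It also shows one kind of object that cannot be marshaled:

```ruby
class Klass
  attr_accessor :name
end

# Hypothetical helper name; wraps the dump/load round trip.
def marshal_copy(obj)
  Marshal.load(Marshal.dump(obj))
end

anna = Klass.new
anna.name = String.new('Anna')

anna_lena = marshal_copy(anna)
anna_lena.name << ' Lena'    # mutates only the copy's string

puts anna.name               # Anna
puts anna_lena.name          # Anna Lena

# Not everything can be dumped, for example a Proc:
begin
  marshal_copy(proc { 1 })
rescue TypeError => e
  puts "cannot marshal: #{e.class}"   # cannot marshal: TypeError
end
```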

Upvotes: 2
