Reputation: 1659
I really don't understand the difference between shallow and deep copy. Ruby's #dup
seems to create a deep copy when I test it.
Documentation says:
Produces a shallow copy of obj---the instance variables of obj are
copied, but not the objects they reference.
But when I test this it seems to change the objects they reference.
class Klass
  attr_accessor :name
end
a = Klass.new
a.name = "John"
b = a.dup
b.name = "Sue"
puts a.name # John
Why is a shallow copy sufficient here when @name is one of the objects they reference?
What's the simplest example where deep copy is needed?
Upvotes: 2
Views: 376
Reputation: 1604
Try this:
class Klass
  attr_accessor :name
end
a = Klass.new
a.name = Klass.new # object inside object
a.name.name = 'George'
b = a.dup
puts b.name.name # George
b.name.name = 'Alex'
puts a.name.name # Alex
Also note that (see info):
When using dup, any modules that the object has been extended with will not be copied.
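The note about extended modules can be checked with a small sketch. The names `Greeter` and `Person` are made up for illustration; the only assumption is that a module is mixed into a single object with `extend`.

```ruby
# Hypothetical module and class, used only for this illustration.
module Greeter
  def greet
    "Hello from #{name}"
  end
end

class Person
  attr_accessor :name
end

original = Person.new
original.name = 'George'
original.extend(Greeter) # adds Greeter to this one object only

dup_copy   = original.dup
clone_copy = original.clone

puts original.respond_to?(:greet)   # true
puts dup_copy.respond_to?(:greet)   # false, dup drops the extension
puts clone_copy.respond_to?(:greet) # true, clone keeps it
```

If the copy needs the per-object extensions, `clone` is the method to reach for; both are still shallow copies.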
Edit: A note on strings (this was interesting to find out): strings are referenced, not copied, in the original scenario. The following case demonstrates it:
a.name = 'George'
puts a.name.object_id # 69918291262760
b = a.dup
puts b.name # George
puts b.name.object_id # 69918291262760
b.name.concat ' likes tomatoes' # append to existing string
puts b.name.object_id # 69918291262760
puts a.name # George likes tomatoes
This works as expected. Referenced objects (including strings) are not copied, and will share the reference.
So why does the original example appear not to? Because when you assign something different to b.name, you are pointing it at a new string object.
a.name = 'hello'
is effectively shorthand for this:
a.name = String.new('hello')
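Therefore, in the original example, a.name and b.name no longer reference the same object; you can check the object_id to confirm.
Note that this sharing cannot be observed with integers (Fixnum in older Rubies), floats, true, false, nil or symbols. These are immutable values that cannot be modified in place, so it makes no visible difference that the original and the copy hold the same value.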
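The difference between reassigning an attribute and mutating the object it points to can be put side by side. This is a minimal sketch reusing the `Klass` from the question; `String.new` is used only so the strings are guaranteed to be mutable.

```ruby
class Klass
  attr_accessor :name
end

a = Klass.new
a.name = String.new('John')
b = a.dup

puts a.name.equal?(b.name) # true: both point at the same String object

b.name = 'Sue'             # reassignment: b now references a different String
puts a.name.equal?(b.name) # false
puts a.name                # John

c = a.dup
c.name << ' Smith'         # in-place mutation of the String shared with a
puts a.name                # John Smith
```

`equal?` compares object identity, so it answers the same question as comparing `object_id` values.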
Upvotes: 1
Reputation: 55833
The example you have shown does not describe the difference between a deep and a shallow copy. Instead, consider this example:
class Klass
  attr_accessor :name
end
anna = Klass.new
anna.name = 'Anna'
anna_lisa = anna.dup
anna_lisa.name << ' Lisa'
# => "Anna Lisa"
anna.name
# => "Anna Lisa"
Generally, dup and clone are both expected to duplicate only the object you call the method on. Other referenced objects, like the name String in the above example, are not duplicated. Thus, after the duplication, both the original and the duplicate point to the very same name string.
With a deep_dup, typically all (relevant) referenced objects are duplicated too, often to arbitrary depth. Since this is rather hard to achieve for all possible object references, people often rely on implementations for specific object types such as hashes and arrays.
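As a rough illustration of such a type-specific implementation, here is a sketch of a recursive helper. The `deep_dup` method below is hypothetical and written for this example, not something Ruby core provides; it only handles hashes, arrays and strings, and it does not guard against cyclic references.

```ruby
# Hypothetical helper for illustration; not part of Ruby core.
def deep_dup(obj)
  case obj
  when Hash
    # Copy keys and values recursively into a fresh Hash.
    obj.each_with_object({}) { |(k, v), copy| copy[deep_dup(k)] = deep_dup(v) }
  when Array
    obj.map { |item| deep_dup(item) }
  when String
    obj.dup
  else
    obj # numbers, symbols, nil, true/false and unknown types are returned as-is
  end
end

original = { name: 'Anna', tags: ['a', 'b'] }
copy = deep_dup(original)
copy[:name] << ' Lisa'
copy[:tags] << 'c'

puts original[:name]         # Anna
puts original[:tags].inspect # ["a", "b"]
```

Anything that falls into the `else` branch is still shared between the original and the copy, which is the limitation the paragraph above describes.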
A common workaround for a fairly generic deep-dup is to use Ruby's Marshal module to serialize an object graph and immediately deserialize it again.
anna_lena = Marshal.load( Marshal.dump(anna))
This creates new objects and is effectively a deep_dup. Since most objects support marshaling out of the box, this is a rather powerful mechanism. Note, though, that you should never unmarshal (i.e. load) user-provided data, since this can lead to a remote code execution vulnerability.
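To see that the round trip really yields independent objects, and where it stops working, here is a short sketch reusing `Klass` from above. The proc stored in `name` at the end is only there to show one kind of object that cannot be marshaled.

```ruby
class Klass
  attr_accessor :name
end

anna = Klass.new
anna.name = String.new('Anna')

deep = Marshal.load(Marshal.dump(anna))
deep.name << ' Lena'

puts anna.name # Anna
puts deep.name # Anna Lena

# Not everything can be marshaled: procs, for instance, raise a TypeError.
anna.name = -> { 'Anna' }
begin
  Marshal.dump(anna)
rescue TypeError => e
  puts "cannot marshal: #{e.message}"
end
```

Objects holding procs, IO handles or similar resources need a hand-written copy routine instead.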
Upvotes: 2