Reputation: 690

Does the shallow/deep-copy terminology apply for objects without references?

Disclaimer

This post is about the correct usage of the terms "shallow-copy" and "deep-copy", specifically when talking about copying an object which does not contain any references. This question is not meant to be (and should not be) opinion-based, unless there truly is no consensus regarding this topic. I have tagged this question as C, but it might be language-agnostic, unless the meaning of those terms in that context is well-defined for specific languages but not for others.

Preface

The terms "shallow-copy" and "deep-copy" are commonly used when copying an object with references, in order to specify whether or not the copy is complete (independent of the original).

However, I have also seen this terminology used when copying an object without references, where both terms mean the exact same thing and there would be no need to differentiate. So far, I have not found a concise definition which would cover this particular use of those terms.

The definitions given on Stack Overflow (in the tags shallow-copy and deep-copy):

A shallow copy contains a link (address in memory) to the original variable. Changes on shallow copies are reflected on origin object.

A deep copy duplicates the object or variable being pointed to so that the destination (the object being assigned to) receives its own local copy.

Under these definitions, a copy of an object without references would be a deep-copy.

The definitions given on Wikipedia (in the article Object copying):

One method of copying an object is the shallow copy. In that case a new object B is created, and the fields values of A are copied over to B. This is also known as a field-by-field copy, field-for-field copy, or field copy. If the field value is a reference to an object (e.g., a memory address) it copies the reference, hence referring to the same object as A does, and if the field value is a primitive type it copies the value of the primitive type. In languages without primitive types (where everything is an object), all fields of the copy B are references to the same objects as the fields of original A. The referenced objects are thus shared, so if one of these objects is modified (from A or B), the change is visible in the other. Shallow copies are simple and typically cheap, as they can be usually implemented by simply copying the bits exactly.

An alternative is a deep copy, meaning that fields are dereferenced: rather than references to objects being copied, new copy objects are created for any referenced objects, and references to these placed in B. The result is different from the result a shallow copy gives in that the objects referenced by the copy B are distinct from those referenced by A, and independent. Deep copies are more expensive, due to needing to create additional objects, and can be substantially more complicated, due to references possibly forming a complicated graph.

Under these definitions, a copy of an object without references would be a shallow-copy.
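To make the Wikipedia definitions concrete, here is a minimal C sketch. The types `struct point` and `struct buffer` and the three helper functions are hypothetical names invented for this illustration. For `struct buffer` the shallow and deep copies behave differently, whereas for the reference-free `struct point` the field-by-field copy is already fully independent of the original.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types, for illustration only. */
struct point  { int x, y; };               /* contains no references */
struct buffer { size_t len; int *data; };  /* contains one reference */

/* Field-by-field copy of a reference-free object: nothing is shared. */
struct point point_copy(struct point src)
{
    return src;
}

/* Shallow copy: the pointer value is copied, so both objects
   refer to the same array. */
struct buffer buffer_shallow_copy(const struct buffer *src)
{
    assert(src != NULL);
    return *src;
}

/* Deep copy: a new array is allocated and the elements are copied.
   On allocation failure (or len == 0) the result is { 0, NULL }. */
struct buffer buffer_deep_copy(const struct buffer *src)
{
    struct buffer dst = { 0, NULL };
    assert(src != NULL);
    if (src->len == 0)
        return dst;
    dst.data = malloc(src->len * sizeof *dst.data);
    if (dst.data != NULL) {
        memcpy(dst.data, src->data, src->len * sizeof *dst.data);
        dst.len = src->len;
    }
    return dst;
}
```

A write through the shallow copy's `data` is visible through the original, while a write through the deep copy's `data` is not. The caller owns the array returned by `buffer_deep_copy` and must `free` it.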

I think both terms are inappropriate, because "shallow-copy" implies that the copy is incomplete, whereas "deep-copy" implies that some kind of special treatment (or high cost) is required for copying. Since copying an object without references is both complete and yet does not require any special treatment, I would argue that neither of those terms should be used. However, this post is not about what I think, but what is the current consensus (if any) in the programming community.

Questions

When I copy an object without references, would that be considered

a shallow-copy (because no references are involved)?
a deep-copy (because the target object is independent from the source object)?
both?
neither?

Is there a good term for a partial deep-copy, where some fields are shallow-copied and others deep-copied?

Upvotes: 1

Answers (2)

Géry Ogam

Reputation: 8047

The paper Copying and Comparing: Problems and Solutions published by Peter Grogono and Markku Sakkinen in 2000 is a good reference for your questions.

Various copying operations can be applied to a source expression and a target expression:

assignment (also known as aliasing), which binds the target expression to the location of the source expression;
replacement (also known as mutation), which copies the contents of the source expression into the location of the target expression;
cloning, which binds the target expression to a new location and copies the contents of the source expression into that new location, i.e. which performs an allocation followed by a replacement.
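One possible reading of these three operations in C is sketched below. It assumes that a "name" is modelled as a pointer variable and a "location" as the object it points to. `struct cell` and the `cell_*` functions are hypothetical names for this illustration and do not come from the paper.

```c
#include <assert.h>
#include <stdlib.h>

struct cell { int value; };   /* hypothetical type with a single value */

/* Assignment (aliasing): the target name is bound to the source's
   location; no contents are copied. */
void cell_assign(struct cell **target, struct cell *source)
{
    assert(target != NULL);
    *target = source;
}

/* Replacement (mutation): the source's contents are copied into the
   location the target already occupies. */
void cell_replace(struct cell *target, const struct cell *source)
{
    assert(target != NULL && source != NULL);
    *target = *source;
}

/* Cloning: an allocation followed by a replacement.
   Returns 0 on success, -1 if the allocation fails. */
int cell_clone(struct cell **target, const struct cell *source)
{
    struct cell *fresh;
    assert(target != NULL && source != NULL);
    fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return -1;
    cell_replace(fresh, source);
    *target = fresh;
    return 0;
}
```

After `cell_assign` the two names share one location. After `cell_replace` or `cell_clone` they hold equal contents in distinct locations, and the caller is responsible for freeing a clone.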

In the following diagrams, the arrows represent bindings, the boxes represent locations, X, Y and Z represent names, A, A′, B and B′ represent values, each • represents a reference, the first function parameter represents the target expression and the second function parameter represents the source expression.

Replacement and cloning can be further categorized by their depth:

shallow operation, which copies values and references;
deep operation, which copies values and performs deep operations on references.

The distinction between shallow and deep operations does not apply to assignment. Shallow cloning and deep cloning are often called shallow copy and deep copy respectively.

Since there are infinitely many possible depths, there are actually infinitely many replacement and cloning operations besides the shallow and deep ones.

We can define replace-k, a replacement of depth k, as follows:

replace-0(X, Y) performs assign(X, Y);
replace-k(X, Y) for k > 0 copies the values of Y into the location of X and performs replace-(k − 1) from the references of Y into the location of X.

We can define clone-k, a cloning of depth k, as follows:

clone-0(X, Y) performs assign(X, Y);
clone-k(X, Y) for k > 0 binds X to a new location, copies the values of Y into that new location and performs clone-(k − 1) from the references of Y into that new location.

Languages that provide cloning operations usually provide only clone-1 (shallow copy) and clone-∞ (deep copy).
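As a rough sketch of clone-k in C, assume a hypothetical `struct node` with one value and one reference, and assume the chain of nodes is acyclic. `node_clone_k` and `DEPTH_INF` are names made up for this example. The largest `unsigned` value stands in for an infinite depth.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node: one value and one reference. */
struct node { int value; struct node *next; };

/* Stands in for clone-∞ in this sketch. */
#define DEPTH_INF ((unsigned)-1)

/* clone-k: for k == 0 (or a null reference) this is plain assignment;
   for k > 0 a new node is allocated, the value is copied, and
   clone-(k - 1) is applied to the reference.
   Returns 0 on success, -1 on allocation failure, in which case
   everything allocated so far is freed and *target is untouched.
   Assumes the chain is acyclic. */
int node_clone_k(struct node **target, struct node *source, unsigned k)
{
    struct node *fresh;
    unsigned next_k;

    assert(target != NULL);
    if (k == 0 || source == NULL) {
        *target = source;               /* clone-0 performs assign */
        return 0;
    }
    fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return -1;
    fresh->value = source->value;       /* copy the values */
    next_k = (k == DEPTH_INF) ? DEPTH_INF : k - 1;
    if (node_clone_k(&fresh->next, source->next, next_k) != 0) {
        free(fresh);
        return -1;
    }
    *target = fresh;
    return 0;
}
```

With k = 1 only the first node is new and its `next` still points into the original chain (a shallow copy). With `DEPTH_INF` every node is new (a deep copy). Intermediate depths share the tail of the chain, so the caller has to track which nodes it owns before freeing them.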

Now that we have provided the definitions, let us address your questions.

When I copy an object without references, would that be considered

a shallow-copy (because no references are involved)?

a deep-copy (because the target object is independent from the source object)?

both?

neither?

It depends on who is considering the clone-k₀ operation with k₀ ≥ 1 that has been applied to the source object:

If it is considered by the caller, he already knows which operation he has applied to the source object, so the solution is: {clone-k₀}.
If it is considered by someone else, he has to guess which operation the caller could have applied to the source object only by comparing the structures of the source object and target object, so the solution is: {clone-1, clone-2, …, clone-∞}.

Is there a good term for a partial deep-copy, where some fields are shallow-copied and others deep-copied?

Not to my knowledge, but this kind of copy is often more useful because it is semantic, whereas shallow copy and deep copy are syntactic. So I would call it a semantic copy, as hinted by the paper:

The shallow and deep operations are not generally useful. In most cases, “shallow” is too shallow and “deep” is too deep. In order to be generally applicable, copying operations should respect the semantic properties of objects rather than merely their syntactic properties.
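A small C sketch of what such a semantic copy might look like, assuming a hypothetical `struct document` that owns its text but merely refers to a style shared by many documents. All names here are invented for the example. The owned field is duplicated and the shared field is copied as a reference.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types, for illustration only. */
struct style { int font_size; };

struct document {
    char *text;                  /* owned by the document: duplicated  */
    const struct style *style;   /* shared between documents: aliased  */
};

/* Copies src into dst according to ownership rather than depth.
   Returns 0 on success, -1 if the allocation fails (dst is then
   left untouched). Requires src->text to be a valid string. */
int document_copy(struct document *dst, const struct document *src)
{
    size_t n;
    char *text;

    assert(dst != NULL && src != NULL && src->text != NULL);
    n = strlen(src->text) + 1;
    text = malloc(n);
    if (text == NULL)
        return -1;
    memcpy(text, src->text, n);  /* deep part: private copy of the text */
    dst->text = text;
    dst->style = src->style;     /* shallow part: same style object     */
    return 0;
}
```

Which fields count as owned is a design decision the type's author has to make. Neither a generic shallow copy nor a generic deep copy can infer it from the layout of the struct alone.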

Upvotes: 3

David Schwartz

Reputation: 182829

When the distinction doesn't apply, just call it a "copy". It's not a shallow copy because there are no shared references and it's not a deep copy because nothing but the values in the structure are copied.

This question is like asking if rocks are atheists. Sure, they aren't theists. But does the theist/atheist distinction really apply to them? Some scales are only designed for measuring certain things.

Upvotes: 5
