Reputation: 3212
I was under the impression that sequences and strings always get deeply copied on assignment. Today I got burned when interfacing with a C library to which I pass the unsafeAddr of a Nim sequence. The C library writes into the memory area starting at the passed pointer.
Since I don't want the library to change the original Nim sequence, I thought I'd simply copy the sequence by assigning it to a new variable named copy and pass the address of the copy to the library.
Lo and behold, the modifications showed up in the original Nim sequence nevertheless. What's even weirder is that this behavior depends on whether the copy is declared via let copy = ... (changes do show up) or via var copy = ... (changes do not show up).
The following code demonstrates this in a very simplified Nim example:
proc changeArgDespiteCopyAssignment(x: seq[int], val: int): seq[int] =
  let copy = x
  let copyPtr = unsafeAddr(copy[0])
  copyPtr[] = val
  result = copy

proc dontChangeArgWhenCopyIsDeclaredAsVar(x: seq[int], val: int): seq[int] =
  var copy = x
  let copyPtr = unsafeAddr(copy[0])
  copyPtr[] = val
  result = copy

let originalSeq = @[1, 2, 3]
var ret = changeArgDespiteCopyAssignment(originalSeq, 9999)
echo originalSeq
echo ret
ret = dontChangeArgWhenCopyIsDeclaredAsVar(originalSeq, 7777)
echo originalSeq
echo ret
This prints
@[9999, 2, 3]
@[9999, 2, 3]
@[9999, 2, 3]
@[7777, 2, 3]
So the first call changes originalSeq while the second doesn't. Can someone explain what is going on under the hood?
I'm using Nim 1.6.6 and am a total Nim newbie.
Upvotes: 4
Views: 773
Reputation: 3212
Turns out there are a lot of issues concerned with this behavior in the nim-lang issue tracker. For example:
Seq assignment does not perform a deep copy
Let behaves differently in proc for default gc
assigning var to local let does not properly copy in default gc
clarify spec/implementation for let: move or copy?
RFC give default gc same semantics as --gc:arc as much as possible
Long story short, whether a copy is made depends on a lot of factors. For sequences, it depends especially on the scope (global vs. local) and the GC in use (refc, arc, orc). More generally, the type involved (seq vs. array) and the code generation backend (C vs. JS), among other things, can also be relevant.
This behavior has tricked a lot of beginners and is not well received by some of the contributors. It doesn't happen with the newer GCs, --gc:arc and --gc:orc; the latter is expected to become the default GC in an upcoming Nim version.
It has never been fixed in the current default gc because of performance concerns, backward compatibility risks and the expectation that it will disappear anyway once we transition to the newer GCs.
Personally, I would have expected that it at least gets clearly singled out in the Nim language manual. Well, it isn't.
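Until the semantics are the same everywhere, one way to sidestep the question is to not rely on assignment at all when a pointer is about to be handed to C. Below is a minimal sketch of that idea: it allocates a fresh buffer and copies element by element, so the result cannot share storage with the source whichever GC is in use. The helper name independentCopy and the variable scratch are my own inventions for illustration, and the pointer write merely stands in for what the C library would do.

```nim
# Hypothetical helper (not from any library): build a new sequence with
# its own freshly allocated buffer instead of relying on assignment.
proc independentCopy[T](src: seq[T]): seq[T] =
  result = newSeq[T](src.len)
  for i, item in src:
    result[i] = item

let originalSeq = @[1, 2, 3]
var scratch = independentCopy(originalSeq)

if scratch.len > 0:
  # scratch is a var, so plain addr works here.
  let p = addr scratch[0]
  # Stand-in for the C library writing through the pointer.
  p[] = 9999

echo originalSeq  # @[1, 2, 3]
echo scratch      # @[9999, 2, 3]
```

The price is an explicit allocation and copy on every call, which is exactly the cost the let optimization tries to avoid, so this is only worth it where something will really write through the pointer.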
Upvotes: 4
Reputation: 3212
I took a quick look at the generated C code; a highly edited and simplified version is below.
The essential thing that is missing in the let copy = ... variant is the call to genericSeqAssign(), which makes a copy of the argument and assigns that to copy. Instead, the argument is assigned to copy directly. So, there is definitely no deep copy on assignment in that case.
I don't know if that is intentional or if it is a bug in the code generation (my first impression was astonishment). Any ideas?
tySequence_A* changeArgDespiteCopyAssignment(tySequence_A* x, NI val) {
  tySequence_A* result = NIM_NIL;
  tySequence_A* copy;
  NI* copyPtr;
  copy = x; /* NO genericSeqAssign() call! */
  copyPtr = &copy->data[(NI) 0];
  *copyPtr = val;
  genericSeqAssign(&result, copy, &NTIseqLintT_B);
  return result;
}
tySequence_A* dontChangeArgWhenCopyIsDeclaredAsVar(tySequence_A* x, NI val) {
  tySequence_A* result = NIM_NIL;
  tySequence_A* copy;
  NI* copyPtr;
  genericSeqAssign(&copy, x, &NTIseqLintT_B);
  copyPtr = &copy->data[(NI) 0];
  *copyPtr = val;
  genericSeqAssign(&result, copy, &NTIseqLintT_B);
  return result;
}
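The aliasing visible in the C code can also be checked from Nim without reading generated code. The following is a rough sketch, not a definitive test: it compares the address of the first element of two sequences to see whether they use the same storage. The names sharesBuffer and inspect are made up for this illustration. Since the outcome for the let and var copies depends on the compiler version and GC, the results are printed rather than assumed.

```nim
# Hypothetical diagnostic helper (not part of the original code).
proc sharesBuffer(a, b: seq[int]): bool =
  ## True if both sequences are non-empty and their first elements
  ## live at the same address, i.e. they use the same storage.
  if a.len == 0 or b.len == 0:
    return false
  result = unsafeAddr(a[0]) == unsafeAddr(b[0])

proc inspect(x: seq[int]) =
  let viaLet = x
  var viaVar = x
  # What gets printed depends on the compiler version and the GC,
  # so no particular result is assumed here.
  echo "let copy shares storage with argument: ", sharesBuffer(viaLet, x)
  echo "var copy shares storage with argument: ", sharesBuffer(viaVar, x)

let originalSeq = @[1, 2, 3]
inspect(originalSeq)
```

Compiling the same file once with --gc:refc and once with --gc:orc (for example nim c -r --gc:orc aliascheck.nim, where the file name is just a placeholder) should show whether the let copy aliases the argument under each setting.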
Upvotes: 2