Reputation: 3274
Note: The code below is essentially nonsense and is for illustration purposes only.
Based on the fact that the right-hand side of an assignment must always be evaluated before its value is assigned to the left-hand side variable, and that increment operations such as ++ and -- are always performed right after evaluation, I would not expect the following code to work:
string[] newArray1 = new[] {"1", "2", "3", "4"};
string[] newArray2 = new string[4];
int IndTmp = 0;
foreach (string TmpString in newArray1)
{
    newArray2[IndTmp] = newArray1[IndTmp++];
}
Rather, I would expect newArray1[0] to be assigned to newArray2[1], newArray1[1] to newArray2[2], and so on, up to the point of throwing a System.IndexOutOfRangeException. Instead, and to my great surprise, the version that throws the exception is:
string[] newArray1 = new[] {"1", "2", "3", "4"};
string[] newArray2 = new string[4];
int IndTmp = 0;
foreach (string TmpString in newArray1)
{
    newArray2[IndTmp++] = newArray1[IndTmp];
}
Since, in my understanding, the compiler first evaluates the RHS, assigns it to the LHS and only then increments, this is unexpected behaviour to me. Or is it really expected, and I am clearly missing something?
Upvotes: 24
Views: 1352
Reputation: 660088
It is instructive to see exactly where your error is:
the right-hand side of an assignment must always be evaluated before its value is assigned to the left-hand side variable
Correct. Clearly the side effect of the assignment cannot happen until after the value being assigned has been computed.
increment operations such as ++ and -- are always performed right after evaluation
Almost correct. It is not clear what you mean by "evaluation": evaluation of what? The original value, the incremented value, or the value of the expression? The easiest way to think about it is that the original value is computed, then the incremented value, then the side effect happens. The final value is then either the original or the incremented value, depending on whether the operator was postfix or prefix. But your basic premise is pretty good: the side effect of the increment happens immediately after the final value is determined, and then the final value is produced.
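To make that sequence concrete, here is a minimal sketch that models the two operators as ordinary methods. The names IncrementSketch, PostIncrement and PreIncrement are hypothetical helpers invented for illustration; this is not how the compiler implements the operators, only a model of the order described above.

```csharp
using System;

static class IncrementSketch
{
    // Hypothetical model of the postfix form, x++.
    public static int PostIncrement(ref int variable)
    {
        int original = variable;         // compute the original value
        int incremented = original + 1;  // compute the incremented value
        variable = incremented;          // the side effect happens
        return original;                 // postfix: the original value is produced
    }

    // Hypothetical model of the prefix form, ++x.
    public static int PreIncrement(ref int variable)
    {
        int original = variable;         // compute the original value
        int incremented = original + 1;  // compute the incremented value
        variable = incremented;          // the side effect happens
        return incremented;              // prefix: the incremented value is produced
    }
}
```

Starting from a variable holding 0, PostIncrement yields 0 and leaves the variable at 1; a following PreIncrement yields 2 and leaves it at 2. In both cases the variable has already changed by the time the value is handed back.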
You then seem to be concluding a falsehood from these two correct premises, namely, that the side effects of the left hand side are produced after the evaluation of the right hand side. But nothing in those two premises implies this conclusion! You've just pulled that conclusion out of thin air.
It would be more clear if you stated a third correct premise:
the storage location associated with the left-hand-side variable also must be known before the assignment takes place.
Clearly this is true. You need to know two things before an assignment can happen: what value is being assigned, and what memory location is being mutated. You can't figure those two things out at the same time; you have to figure out one of them first, and we figure out the one on the left hand side -- the variable -- first in C#. If figuring out where the storage is located causes a side effect then that side effect is produced before we figure out the second thing -- the value being assigned to the variable.
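A small sketch can show that ordering directly. The class AssignmentOrderSketch and the two logging delegates are assumptions made up for this illustration; each one records when it runs, so the trace shows which side was figured out first.

```csharp
using System;
using System.Collections.Generic;

static class AssignmentOrderSketch
{
    public static List<string> Trace()
    {
        var log = new List<string>();
        int[] target = new int[2];

        // Hypothetical helpers: each records the moment it is evaluated.
        Func<int> index = () => { log.Add("left: index"); return 0; };
        Func<int> value = () => { log.Add("right: value"); return 42; };

        // The storage location on the left is determined before the value on the right.
        target[index()] = value();

        log.Add("stored " + target[0]);
        return log;
    }
}
```

The returned trace is "left: index", then "right: value", then "stored 42": the side effect of locating the storage is produced before the right-hand side is evaluated.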
In short, in C# the order of evaluations in an assignment to a variable goes like this:
Upvotes: 12
Reputation: 391336
This is well-defined in the C# language according to Eric Lippert and is easily explained.
Note: The actual execution of the code might not be like this. The important thing to remember is that the compiler must create code that is equivalent to this.
So what happens in the second piece of code is this:
1.1 newArray2 is evaluated and the result is remembered (i.e. the reference to whatever array we want to store things in is remembered, in case side effects later change it)
1.2 IndTemp is evaluated and the result is remembered
1.3 IndTemp is increased by 1
2.1 newArray1 is evaluated and the result is remembered
2.2 IndTemp is evaluated and the result is remembered (but this is 1 here)
As you can see, the second time IndTemp is evaluated (RHS), the value has already been increased by 1, but this has no impact on the LHS, since it remembers that the value was 0 before the increase.
In the first piece of code, the order is slightly different:
1.1 newArray2 is evaluated and the result is remembered
1.2 IndTemp is evaluated and the result is remembered
2.1 newArray1 is evaluated and the result is remembered
2.2 IndTemp is evaluated and the result is remembered (still 0 here)
2.3 IndTemp is increased by 1
In this case, the increase of the variable at step 2.3 has no impact on the current loop iteration, and thus you will always copy from index N into index N, whereas in the second piece of code you will always copy from index N+1 into index N.
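The two orderings can also be written out by hand with explicit temporaries. This is only a sketch of code that is equivalent in effect, not what the compiler literally emits; the class ExpandedSketch and the locals target, targetIndex, source and sourceIndex are hypothetical names introduced here.

```csharp
using System;

static class ExpandedSketch
{
    // Models: newArray2[IndTemp] = newArray1[IndTemp++];
    public static int First(string[] newArray1, string[] newArray2, int indTemp)
    {
        string[] target = newArray2;   // left side: remember the array
        int targetIndex = indTemp;     // left side: remember the index
        string[] source = newArray1;   // right side: remember the array
        int sourceIndex = indTemp;     // right side: remember the index (same value)
        indTemp = indTemp + 1;         // right side: the increment happens last
        target[targetIndex] = source[sourceIndex];
        return indTemp;
    }

    // Models: newArray2[IndTemp++] = newArray1[IndTemp];
    public static int Second(string[] newArray1, string[] newArray2, int indTemp)
    {
        string[] target = newArray2;   // left side: remember the array
        int targetIndex = indTemp;     // left side: remember the index
        indTemp = indTemp + 1;         // left side: the increment happens here
        string[] source = newArray1;   // right side: remember the array
        int sourceIndex = indTemp;     // right side: the index is already one higher
        target[targetIndex] = source[sourceIndex];
        return indTemp;
    }
}
```

Called with the four-element array from the question and a starting index of 0, First stores "1" at position 0 and Second stores "2" at position 0. Calling Second with a starting index of 3 reads source[4] and throws IndexOutOfRangeException.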
Eric has a blog entry titled Precedence vs order, redux that should be read.
Here is a piece of code that illustrates this. I basically turned the variables into properties of a class and implemented a custom "array" collection, all of which just dump what is happening to the console.
void Main()
{
    Console.WriteLine("first piece of code:");
    Context c = new Context();
    c.newArray2[c.IndTemp] = c.newArray1[c.IndTemp++];
    Console.WriteLine();
    Console.WriteLine("second piece of code:");
    c = new Context();
    c.newArray2[c.IndTemp++] = c.newArray1[c.IndTemp];
}

class Context
{
    private Collection _newArray1 = new Collection("newArray1");
    private Collection _newArray2 = new Collection("newArray2");
    private int _IndTemp;

    public Collection newArray1
    {
        get
        {
            Console.WriteLine(" reading newArray1");
            return _newArray1;
        }
    }

    public Collection newArray2
    {
        get
        {
            Console.WriteLine(" reading newArray2");
            return _newArray2;
        }
    }

    public int IndTemp
    {
        get
        {
            Console.WriteLine(" reading IndTemp (=" + _IndTemp + ")");
            return _IndTemp;
        }
        set
        {
            Console.WriteLine(" setting IndTemp to " + value);
            _IndTemp = value;
        }
    }
}

class Collection
{
    private string _name;

    public Collection(string name)
    {
        _name = name;
    }

    public int this[int index]
    {
        get
        {
            Console.WriteLine(" reading " + _name + "[" + index + "]");
            return 0;
        }
        set
        {
            Console.WriteLine(" writing " + _name + "[" + index + "]");
        }
    }
}
Output is:
first piece of code:
reading newArray2
reading IndTemp (=0)
reading newArray1
reading IndTemp (=0)
setting IndTemp to 1
reading newArray1[0]
writing newArray2[0]
second piece of code:
reading newArray2
reading IndTemp (=0)
setting IndTemp to 1
reading newArray1
reading IndTemp (=1)
reading newArray1[1]
writing newArray2[0]
Upvotes: 18
Reputation: 13091
ILDasm can be your best friend, sometimes ;-)
I compiled both of your methods and compared the resulting IL (the .NET intermediate language).
The important detail is in the loop, unsurprisingly. Your first method compiles and runs like this:
Code         Description                  Stack
ldloc.1      Load ref to newArray2        newArray2
ldloc.2      Load value of IndTmp         newArray2,0
ldloc.0      Load ref to newArray1        newArray2,0,newArray1
ldloc.2      Load value of IndTmp         newArray2,0,newArray1,0
dup          Duplicate top of stack       newArray2,0,newArray1,0,0
ldc.i4.1     Load 1                       newArray2,0,newArray1,0,0,1
add          Add top 2 values on stack    newArray2,0,newArray1,0,1
stloc.2      Update IndTmp                newArray2,0,newArray1,0     <-- IndTmp is 1
ldelem.ref   Load array element           newArray2,0,"1"
stelem.ref   Store array element          <empty>
                                          <-- newArray2[0] = "1"
This is repeated for each element in newArray1. The important point is that the location of the element in the source array has been pushed to the stack before IndTmp is incremented.
Compare this to the second method:
Code         Description                  Stack
ldloc.1      Load ref to newArray2        newArray2
ldloc.2      Load value of IndTmp         newArray2,0
dup          Duplicate top of stack       newArray2,0,0
ldc.i4.1     Load 1                       newArray2,0,0,1
add          Add top 2 values on stack    newArray2,0,1
stloc.2      Update IndTmp                newArray2,0                 <-- IndTmp is 1
ldloc.0      Load ref to newArray1        newArray2,0,newArray1
ldloc.2      Load value of IndTmp         newArray2,0,newArray1,1
ldelem.ref   Load array element           newArray2,0,"2"
stelem.ref   Store array element          <empty>
                                          <-- newArray2[0] = "2"
Here, IndTmp is incremented before the location of the element in the source array has been pushed to the stack, hence the difference in behaviour (and the subsequent exception).
For completeness, let's compare it with
newArray2[IndTmp] = newArray1[++IndTmp];
Code         Description                  Stack
ldloc.1      Load ref to newArray2        newArray2
ldloc.2      Load IndTmp                  newArray2,0
ldloc.0      Load ref to newArray1        newArray2,0,newArray1
ldloc.2      Load IndTmp                  newArray2,0,newArray1,0
ldc.i4.1     Load 1                       newArray2,0,newArray1,0,1
add          Add top 2 values on stack    newArray2,0,newArray1,1
dup          Duplicate top stack entry    newArray2,0,newArray1,1,1
stloc.2      Update IndTmp                newArray2,0,newArray1,1     <-- IndTmp is 1
ldelem.ref   Load array element           newArray2,0,"2"
stelem.ref   Store array element          <empty>
                                          <-- newArray2[0] = "2"
Here, the result of the increment has been pushed to the stack (and becomes the array index) before IndTmp is updated.
In summary, it seems that the target of the assignment is evaluated first, followed by the source.
Thumbs up to the OP for a really thought-provoking question!
Upvotes: 21
Reputation: 849
Obviously, the assumption that the rhs is always evaluated before the lhs is wrong. According to http://msdn.microsoft.com/en-us/library/aa691315(v=VS.71).aspx it seems that, in the case of an indexer access, the arguments of the indexer access expression, which is the lhs here, are evaluated before the rhs.
In other words, first it is determined where to store the result of the rhs; only then is the rhs evaluated.
Upvotes: 4
Reputation: 124642
It throws an exception because you start indexing into newArray1 at index 1. Since you are iterating over each element in newArray1, the last assignment throws an exception because IndTmp is equal to newArray1.Length, i.e., one past the end of the array. You increment the index variable before it is ever used to extract an element from newArray1, which means you will crash and also miss the first element in newArray1.
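As a rough check of that description, the loop from the question can be wrapped in a try/catch so the state at the moment of failure can be inspected. The class LoopSketch, its Run method and the out parameters are hypothetical scaffolding added for this sketch; the loop body itself is the one from the question.

```csharp
using System;

static class LoopSketch
{
    public static string[] Run(out int finalIndex, out bool threw)
    {
        string[] newArray1 = new[] { "1", "2", "3", "4" };
        string[] newArray2 = new string[4];
        int IndTmp = 0;
        threw = false;
        try
        {
            foreach (string TmpString in newArray1)
            {
                // The index is incremented before newArray1 is read.
                newArray2[IndTmp++] = newArray1[IndTmp];
            }
        }
        catch (IndexOutOfRangeException)
        {
            // Reached on the last iteration, when IndTmp equals newArray1.Length.
            threw = true;
        }
        finalIndex = IndTmp;
        return newArray2;
    }
}
```

After the call, newArray2 holds "2", "3", "4" and a null in the last slot, finalIndex is 4 and threw is true: the first element of newArray1 was never copied, and the fourth read went one past the end.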
Upvotes: 3
Reputation: 93030
newArray2[IndTmp] = newArray1[IndTmp++];
leads to first assigning and then incrementing the variable.
and so on.
The postfix ++ operator (the ++ written to the right of the variable) increments right away, but it returns the value from before the increment. The value used to index into the array is the value returned by that operator, i.e. the non-incremented value.
What you describe (the exception thrown) would be the result of a prefix ++ (written to the left of the variable):
newArray2[IndTmp] = newArray1[++IndTmp]; //throws exception
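A single assignment is enough to compare the two forms side by side. The class RhsIncrementSketch and the local names src, dst and i are assumptions for this sketch; each method runs one statement and reports what landed in dst[0] and where the index ended up.

```csharp
using System;

static class RhsIncrementSketch
{
    // Postfix on the right: the index used for src is the value before the increment.
    public static string Postfix()
    {
        string[] src = new[] { "1", "2", "3", "4" };
        string[] dst = new string[4];
        int i = 0;
        dst[i] = src[i++];
        return dst[0] + ", i=" + i;
    }

    // Prefix on the right: the index used for src is the value after the increment.
    public static string Prefix()
    {
        string[] src = new[] { "1", "2", "3", "4" };
        string[] dst = new string[4];
        int i = 0;
        dst[i] = src[++i];
        return dst[0] + ", i=" + i;
    }
}
```

Postfix() returns "1, i=1" and Prefix() returns "2, i=1". In a loop over all four elements, the prefix form would eventually read src[4], which is where the exception comes from.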
Upvotes: 13