Reputation: 333
A newbie question, but I'm really confused... Suppose there is a DataFrame like this:
>>> import pandas as pd
>>> test = pd.DataFrame({'a': [1, 1, 0, 0], 'b': [1, 1, 0, 0]})
>>> test
   a  b
0  1  1
1  1  1
2  0  0
3  0  0
Running the following code,
test.loc[2:] = 1
sets the data in the 3rd and 4th rows to 1, so the DataFrame becomes:
   a  b
0  1  1
1  1  1
2  1  1
3  1  1
But if the subset is assigned to a variable first, the DataFrame does not change:
temp = test.loc[2:]
temp = 2  # nothing changed, only the temp variable is set to 2
So what is the difference between these two pieces of code? Why does the first one change the DataFrame?
Upvotes: 0
Views: 570
Reputation: 2845
This has nothing to do with loc specifically. It is the standard way assignment works in Python. A DataFrame is an object, and loc is an indexer attribute of that object. Consider the example below, where a is created as a list and b is a second reference to the same list. Changing an element through b (b[1] = 5) modifies the list that both names refer to. Reassigning b (b = 3) only makes the name b point to something else; a stays the same until you reassign a itself.
# create a list
a = [1,2,3]
# add another reference to the list
b = a
# modify the list in place through b; a refers to the same list
b[1] = 5
b
Out[4]: [1, 5, 3]
# reassign the reference
b = 3
b
Out[6]: 3
a
Out[7]: [1, 5, 3]
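The same distinction can be shown with the DataFrame from the question. This is a minimal sketch, assuming pandas is installed; it only restates the question's two snippets side by side, with comments on what each statement does.

```python
import pandas as pd

test = pd.DataFrame({'a': [1, 1, 0, 0], 'b': [1, 1, 0, 0]})

# Item assignment: Python calls __setitem__ on the .loc indexer,
# which writes into the existing DataFrame object.
test.loc[2:] = 1

# Plain assignment: the name temp first refers to a DataFrame,
# then is rebound to the integer 2. test is not involved at all.
temp = test.loc[2:]
temp = 2

print(test)  # every value is now 1
print(temp)  # 2
```

The first statement has a subscript on the left-hand side, so it operates on the object; the second has only a bare name on the left-hand side, so it operates on the name.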
Upvotes: 1
Reputation: 107
Assignment in Python never copies an object; every variable is a reference to an object. Your DataFrame is an object, and assigning it to a name does not copy it unless you copy it explicitly. When you assign a slice of your DataFrame to temp and then assign 2 to temp, you are only rebinding temp to point to 2; the DataFrame itself is untouched. Try keeping it as functional/pythonic as possible.
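A short sketch of the "no copy unless explicit" point, assuming pandas is installed. The names alias and snapshot are illustrative and do not appear in the question.

```python
import pandas as pd

test = pd.DataFrame({'a': [1, 1, 0, 0], 'b': [1, 1, 0, 0]})

alias = test            # a second name for the same object, no copy made
snapshot = test.copy()  # an explicit, independent copy

# In-place change made through the alias is visible through test,
# because both names refer to one object.
alias.loc[2:] = 1

print(alias is test)           # True
print(test['a'].tolist())      # [1, 1, 1, 1]
print(snapshot['a'].tolist())  # [1, 1, 0, 0]

# Rebinding the alias leaves the DataFrame untouched.
alias = 2
print(test['a'].tolist())      # [1, 1, 1, 1]
```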
Upvotes: 1