Jin Mengfei

Reputation: 333

How does loc in pandas dataframe work?

A newbie question, but I'm really confused. Suppose there is a dataframe like this:

>>> import pandas as pd
>>> test = pd.DataFrame({'a': [1, 1, 0, 0], 'b': [1, 1, 0, 0]})
>>> test
   a  b
0  1  1
1  1  1
2  0  0
3  0  0

Running the following code,

test.loc[2:] = 1

the data in the 3rd and 4th rows is set to 1, and the dataframe becomes:

   a  b
0  1  1
1  1  1
2  1  1
3  1  1

But if the subset is assigned to a variable first, the dataframe does not change:

temp = test.loc[2:]
temp = 2  # nothing changed, the temp variable is just set to 2

So what is the difference between these two pieces of code? Why does the first one change the dataframe?

Upvotes: 0

Views: 570

Answers (2)

vumaasha

Reputation: 2845

This has nothing to do with loc specifically. It is how assignment works in Python. A DataFrame is an object, and loc is an indexer attribute of that object (not a method). In the example below, a is created as a list and b is a second reference to the same list. Assigning to an element, b[1] = 5, modifies the list itself, so the change is visible through a as well. Assigning to the name, b = 3, only makes b point to something else; a stays the same until you modify or reassign a. In the same way, test.loc[2:] = 1 is an item assignment that writes into the DataFrame, while temp = 2 just rebinds the name temp.

# create a list
a = [1,2,3]

# add another reference to the same list
b = a

# modify the list in place through b
b[1] = 5

b
Out[4]: [1, 5, 3]

# reassign the reference
b = 3

b
Out[6]: 3

a
Out[7]: [1, 5, 3]
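
The same distinction can be reproduced with the DataFrame from the question. This is a minimal sketch, assuming pandas is installed; the names after_setitem and after_rebind are only there to record the state and are not part of the original question.

```python
import pandas as pd

test = pd.DataFrame({'a': [1, 1, 0, 0], 'b': [1, 1, 0, 0]})

# Item assignment: Python translates this into a __setitem__ call on the
# object returned by test.loc, which writes into the DataFrame itself.
test.loc[2:] = 1
after_setitem = test['a'].tolist()
print(after_setitem)  # [1, 1, 1, 1]

# Plain assignment: temp is first bound to the object returned by
# test.loc[2:], then rebound to the integer 2. Rebinding a name never
# touches the object the name used to refer to.
temp = test.loc[2:]
temp = 2
after_rebind = test['a'].tolist()
print(after_rebind)  # [1, 1, 1, 1], unchanged by temp = 2
```

The point is the syntax: something[key] = value asks an object to change itself, while name = value only changes what the name refers to.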

Upvotes: 1

Harsh Vedant

Reputation: 107

Python assignment binds a name to an object; it does not copy the object. Every variable is a reference to an object. Your dataframe is an object, and assignment never copies it unless you copy it explicitly. So when you assign temp to a slice of your DataFrame and then write temp = 2, you are only making temp point to the integer 2. The DataFrame is not touched. Try keeping it as functional/pythonic as possible.
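
A short sketch of the "never copied unless you copy it explicitly" point, assuming pandas is installed. The names alias and independent are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 0, 0]})

alias = df                 # a second name for the same object, no copy
same_object = alias is df  # True

independent = df.copy()          # explicit copy: a separate DataFrame
is_copy_same = independent is df  # False

# A change made through alias is visible through df, but not in the copy.
alias.loc[0, 'a'] = 9
seen_via_df = int(df.loc[0, 'a'])             # 9
seen_via_copy = int(independent.loc[0, 'a'])  # 1

# Rebinding alias does nothing to df.
alias = 2
print(same_object, is_copy_same, seen_via_df, seen_via_copy)
```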

Upvotes: 1
