Reputation: 2962
Assume I have a pandas.DataFrame, say
In [1]: df = pd.DataFrame([['a', 'x'], ['b', 'y'], ['c', 'z']],
   ...:                   index=[10, 20, 30],
   ...:                   columns=['first', 'second'])
In [2]: df
Out[2]:
   first second
10     a      x
20     b      y
30     c      z
and I want to update the first two entries of the first column with the corresponding entries of the second column. First I tried
to_change = df.index <= 20
df[to_change]['first'] = df[to_change]['second']
but this does not work: df is left unchanged. However,
df['first'][to_change] = df['second'][to_change]
works fine.
Can anyone explain? What is the rationale behind this behavior? Although I use pandas a lot, I find that these kinds of issues sometimes make it hard to predict what a particular piece of pandas code will actually do. Maybe someone can provide an insight that helps me improve my mental model of the inner workings of pandas.
Upvotes: 2
Views: 607
Reputation: 129018
In master/0.13 (releasing very shortly), this will now warn that you are modifying a copy (an option controls whether it warns, raises, or is ignored):
In [1]: df = pd.DataFrame([['a', 'x'], ['b', 'y'], ['c', 'z']],
   ...:                   index=[10, 20, 30],
   ...:                   columns=['first', 'second'])
In [2]: df
Out[2]:
   first second
10     a      x
20     b      y
30     c      z
In [3]: to_change = df.index <= 20
In [4]: df[to_change]['first'] = df[to_change]['second']
pandas/core/generic.py:1008: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
warnings.warn(t,SettingWithCopyWarning)
In [5]: df['first'][to_change] = df['second'][to_change]
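As for why the two forms differ: each one is two separate indexing calls. In `df[to_change]['first'] = ...`, the boolean-mask selection `df[to_change]` builds a new DataFrame, so the assignment lands on that temporary object and `df` never sees it. In `df['first'][to_change] = ...`, the first call selects a single column, which at the time of writing is returned as a view, so the assignment reaches `df`. That depends on pandas internals, so it is safer not to rely on it. A minimal sketch of the single-call form the warning recommends, reusing the frame from the question:

```python
import pandas as pd

df = pd.DataFrame([['a', 'x'], ['b', 'y'], ['c', 'z']],
                  index=[10, 20, 30],
                  columns=['first', 'second'])
to_change = df.index <= 20

# A single .loc call selects rows and column together, so the assignment
# is applied to df itself rather than to an intermediate object.
df.loc[to_change, 'first'] = df.loc[to_change, 'second']

print(df)
```

Because the row mask and the column label go through one indexer, there is no intermediate object whose copy-or-view status matters.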
Upvotes: 2