Reputation: 498
Let's say I have data like this:
import pandas as pd
df = pd.DataFrame({'category': ["blue", "blue", "blue", "blue", "green"], 'val1': [5, 3, 2, 2, 5], 'val2': [1, 3, 2, 2, 5]})
print(df)
  category  val1  val2
0     blue     5     1
1     blue     3     3
2     blue     2     2
3     blue     2     2
4    green     5     5
I want to filter by category, then select a column and a row range, like this:
print(df.loc[df['category'] == 'blue'].loc[1:2, 'val1'])
1 3
2 2
Name: val1, dtype: int64
This works for selecting the data I am interested in, but when I try to use this selection to overwrite part of my dataframe, I get the warning "A value is trying to be set on a copy of a slice from a DataFrame".
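A minimal sketch reproducing the problem, assuming the same df as above. The value 123 is an arbitrary placeholder, and the warnings are silenced only so the snippet runs quietly. It shows that the chained assignment leaves df unchanged, because the first loc with a boolean mask returns a new object:

```python
import warnings

import pandas as pd

df = pd.DataFrame({
    "category": ["blue", "blue", "blue", "blue", "green"],
    "val1": [5, 3, 2, 2, 5],
    "val2": [1, 3, 2, 2, 5],
})

# Chained indexing: the first .loc (boolean mask) returns a copy,
# so the assignment lands on that temporary object, not on df.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the chained-assignment warning
    df.loc[df["category"] == "blue"].loc[1:2, "val1"] = 123

print(df["val1"].tolist())  # [5, 3, 2, 2, 5], i.e. df was not modified
```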
I am familiar with this message and I know it occurs when trying to overwrite something through a chained selection like df.loc[rows].loc[columns] instead of a single df.loc[rows, columns].
However, I can't figure out how to put all three things I am filtering on (a certain value for category, a certain column, and a certain row range) into a single .loc[...]. How can I select this part of the data in a way that lets me overwrite it in the dataframe?
Upvotes: 1
Views: 2026
Reputation: 403020
This makes sense because you are chaining two loc calls: the first one returns a copy, so the assignment never reaches df. My suggestion is to squash the two loc calls into one. You can do this by filtering, grabbing the index of the filtered rows, and using it in a single loc call:
df.loc[df[df['category'].eq('blue')].index[1:3], 'val1'] = 123
Notice I have to slice the filtered index with [1:3] instead of [1:2] because the end of the range is not inclusive for positional slicing (unlike loc, which uses label-based slicing and includes the end label).
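To make the steps explicit, here is a minimal sketch of the same idea split into named pieces. The variable names blue_labels, target and mask are my own, and the assigned values 123 and 0 are placeholders. It also shows one possible variant for the case where the row range is meant as index labels rather than positions within the filtered rows:

```python
import pandas as pd

df = pd.DataFrame({
    "category": ["blue", "blue", "blue", "blue", "green"],
    "val1": [5, 3, 2, 2, 5],
    "val2": [1, 3, 2, 2, 5],
})

# Step 1: index labels of the rows that match the filter (here 0, 1, 2, 3).
blue_labels = df.index[df["category"].eq("blue")]

# Step 2: positional slice of those labels; the end is exclusive,
# so [1:3] picks the second and third blue rows (labels 1 and 2).
target = blue_labels[1:3]

# Step 3: one .loc call with row labels and column name, so df itself is modified.
df.loc[target, "val1"] = 123
print(df["val1"].tolist())  # [5, 123, 123, 2, 5]

# Variant (assumption: the range 1..2 refers to index labels, as in .loc[1:2]):
# combine the category filter and the label range into one boolean mask.
mask = df["category"].eq("blue") & (df.index >= 1) & (df.index <= 2)
df.loc[mask, "val2"] = 0
print(df["val2"].tolist())  # [1, 0, 0, 2, 5]
```

With the default integer index used here, both variants touch the same rows. They differ once the index labels no longer match the positions within the filtered rows, for example after sorting or dropping rows.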
Upvotes: 2