Reputation: 446
I've been reading over this and still find the subject a little confusing: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
Say I have a Pandas DataFrame and I wish to simultaneously set the first and last row elements of a single column to some values. I tried this:
df.iloc[[0, -1]].mycol = [1, 2]
which warns me that "A value is trying to be set on a copy of a slice from a DataFrame" and that this is potentially dangerous.
I could use .loc instead, but then I need to know the index labels of the first and last rows (in contrast, .iloc allows me to access by position).
What's the safest Pandasy way to do this?
To get to this point:
# Django queryset
query = market.stats_set.annotate(distance=F("end_date") - query_date)
# Generate a dataframe from this queryset, and order by distance
df = pd.DataFrame.from_records(query.values("distance", *fields), coerce_float=True)
df = df.sort_values("distance").reset_index(drop=True)
Then, I try calling df.distance.iloc[[0, -1]] = [1, 2]. This raises the warning.
Upvotes: 1
Views: 3875
Reputation: 33793
The issue isn't with iloc itself, it's the chained access: df.iloc[[0, -1]] returns a separate object, so setting .mycol on it may modify a copy rather than df. You can do this all within iloc:
df.iloc[[0, -1], df.columns.get_loc('mycol')] = [1, 2]
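As a minimal runnable sketch of the line above: the frame and its values are made up for illustration, and only the column name mycol comes from the question. It assumes column names are unique, so that get_loc returns a single integer position.

```python
import pandas as pd

# Hypothetical data; only the column name 'mycol' is taken from the question.
df = pd.DataFrame({"mycol": [10, 20, 30, 40], "other": [1.0, 2.0, 3.0, 4.0]})

# Translate the column label into an integer position (assumes unique column names).
col_pos = df.columns.get_loc("mycol")

# Select rows and column in a single iloc call, so the assignment targets df itself.
df.iloc[[0, -1], col_pos] = [1, 2]

print(df["mycol"].tolist())
```

Because rows and column are given to one indexer, there is no intermediate object for the assignment to land on.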
Usually ix is used if you want mixed integer- and label-based access, but it doesn't work in this case since -1 isn't actually in the index, and apparently ix isn't smart enough to know it should be the last index.
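Note that ix was removed in pandas 1.0. If you would rather stay label-based, one possible alternative (a sketch, not the only way) is to look up the first and last labels from df.index by position and pass them to .loc. The frame below is hypothetical, and the approach assumes the index labels are unique.

```python
import pandas as pd

# Hypothetical frame with a non-integer index, to show that labels need not be known in advance.
df = pd.DataFrame({"mycol": [10, 20, 30, 40]}, index=["w", "x", "y", "z"])

# Positional lookup on the index yields the labels of the first and last rows.
first_last = df.index[[0, -1]]

# One .loc call with both row labels and the column label (assumes unique index labels).
df.loc[first_last, "mycol"] = [1, 2]

print(df["mycol"].tolist())
```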
Upvotes: 1
Reputation: 394013
What you're doing is called chained indexing. You can use iloc just on that column to avoid the warning:
In [24]:
df = pd.DataFrame(np.random.randn(5,3), columns=list('abc'))
df
Out[24]:
          a         b         c
0  1.589940  0.735713 -1.158907
1  0.485653  0.044611  0.070907
2  1.123221 -0.862393 -0.807051
3  0.338653 -0.734169 -0.070471
4  0.344794  1.095861 -1.300339
In [25]:
df['a'].iloc[[0,-1]] ='foo'
df
Out[25]:
          a         b         c
0       foo  0.735713 -1.158907
1  0.485653  0.044611  0.070907
2   1.12322 -0.862393 -0.807051
3  0.338653 -0.734169 -0.070471
4       foo  1.095861 -1.300339
If you do it the other way then it raises the warning:
In [27]:
df.iloc[[0,-1]]['a'] ='foo'
C:\WinPython-64bit-3.4.3.1\python-3.4.3.amd64\lib\site-packages\IPython\kernel\__main__.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
if __name__ == '__main__':
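To see why the warning matters, here is a small sketch with made-up data: the rows-then-column form assigns to a temporary object, so df itself is left untouched, while a single-step iloc assignment does change it. The warning filter is only there to keep the output quiet. The column-first form shown earlier is also chained indexing and, as far as I know, stops working under Copy-on-Write in newer pandas, so the single-step form is the safer habit.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical frame: column 'a' holds 0, 3, 6, 9, 12.
df = pd.DataFrame(np.arange(15, dtype=float).reshape(5, 3), columns=list("abc"))
before = df.copy()

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the chained-assignment warning
    # Chained: iloc with a list returns a new object, and the assignment lands on it.
    df.iloc[[0, -1]]["a"] = -1.0

unchanged = df.equals(before)
print(unchanged)

# Single-step assignment: rows and column in one indexer, so df is modified.
df.iloc[[0, -1], df.columns.get_loc("a")] = -1.0
print(df["a"].tolist())
```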
Upvotes: 1