Replacing a column with a function of itself in Pandas?

Question

I'm currently lost deep inside the pandas documentation. My problem is this:

I have a simple dataframe

col1  col2
 1     A
 4     B 
 5     X

My aim is to apply something like:

 df['col1'] = df['col1'].apply(square)

where square is a cleanly defined function. But this operation raises a warning (and produces incorrect results):

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

I can't make sense of this or of the documentation it points to. My workflow is linear (in case this makes a wider range of solutions viable).

Pandas 0.17.1 and Python 2.7

All help much appreciated.

MaxU - stand with Ukraine · Accepted Answer

It works properly for me (pandas 0.18.1):

In [31]: def square(x):
   ....:     return x ** 2
   ....:

In [33]: df
Out[33]:
   col1 col2
0     1    A
1     4    B
2     5    X

In [35]: df.col1 = df.col1.apply(square)

In [36]: df
Out[36]:
   col1 col2
0     1    A
1    16    B
2    25    X

PS: it might also depend on how your function is implemented...
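The warning in the question usually appears when df was itself produced by slicing or filtering another DataFrame. The question doesn't show how df was built, so the following is only a sketch under that assumption: the name raw and the filter condition are hypothetical. Taking an explicit .copy() of the subset makes df an independent frame, so the assignment is no longer treated as a write to a copy of a slice.

```python
import pandas as pd


def square(x):
    return x ** 2


# Hypothetical larger frame that df is derived from (not in the question).
raw = pd.DataFrame({"col1": [1, 4, 5, 7], "col2": ["A", "B", "X", "Y"]})

# Assumption: df is a filtered subset of raw. The explicit .copy() makes it
# an independent DataFrame instead of a possible view/copy of raw.
df = raw[raw["col1"] < 6].copy()

# Replace the column with a function of itself.
df["col1"] = df["col1"].apply(square)

print(df)
```

If the assumption holds, the original frame stays untouched and only df carries the squared values. Writing the assignment as df.loc[:, 'col1'] = ..., as the warning text suggests, is another option once df is an independent frame.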
