Shawn

Reputation: 603

Scale one value based on another in Pandas

This question is similar to this one. However, when I adapt that solution, I receive the following error:

    ValueError: cannot reindex from a duplicate axis

I'm trying to do something like this:

    import pandas as pd

    cols = {'foo': ['A','A','Z','A','Z'], 'bar' : [1,1,1,1,1]}
    df = pd.DataFrame(data=cols)

    df

       bar foo
    0   1   A
    1   1   A
    2   1   Z
    3   1   A
    4   1   Z

    df[df['foo'] == 'Z']['bar'] = df[df['foo'] == 'Z']['bar'] * 100


    C:\Anaconda3\envs\Scikit\lib\site-packages\ipykernel_launcher.py:1: SettingWithCopyWarning: 
    A value is trying to be set on a copy of a slice from a DataFrame.
    Try using .loc[row_indexer,col_indexer] = value instead

    See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
      """Entry point for launching an IPython kernel.

Upvotes: 1

Views: 345

Answers (1)

jezrael

Reputation: 863651

Use DataFrame.loc to select the rows by condition and the column by label in a single indexing step:

    df.loc[df['foo'] == 'Z', 'bar'] *= 100
    # same as
    # df.loc[df['foo'] == 'Z', 'bar'] = df.loc[df['foo'] == 'Z', 'bar'] * 100
    print(df)

      foo  bar
    0   A    1
    1   A    1
    2   Z  100
    3   A    1
    4   Z  100
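
For context on why the original attempt triggers the warning, here is a minimal sketch using the question's data. The names `mask` and `subset` are only illustrative and do not appear in the original post. As far as I understand it, `df[mask]` produces a separate DataFrame, so assigning to `df[mask]['bar']` changes that temporary object instead of `df`, while one `.loc` call writes to `df` directly:

```python
import pandas as pd

df = pd.DataFrame({'foo': ['A', 'A', 'Z', 'A', 'Z'], 'bar': [1, 1, 1, 1, 1]})

# Boolean mask marking the rows to scale (illustrative name).
mask = df['foo'] == 'Z'

# Boolean indexing returns a new DataFrame object, so a chained
# assignment like df[mask]['bar'] = ... would target this temporary.
subset = df[mask]
print(subset is df)  # False

# A single .loc call addresses the rows and the column on df itself.
df.loc[mask, 'bar'] *= 100
print(df['bar'].tolist())  # [1, 1, 100, 1, 100]
```

Storing the mask in a variable is just a readability choice here; it also avoids evaluating the comparison twice if you use the long `=` form.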

Upvotes: 2
