Migos

Reputation: 148

Pandas - Set value in Column based on values in 3 other columns

I have a dump of transactions. Columns in the data set provide information regarding Currency and which FS each transaction flows to.

I want to translate currencies at two different rates depending on which FS the transaction flows to. There are two currencies, USD and CAD, and two FS, BS and IS. I have one column with all amounts in USD and one with all amounts in CAD. See the table below for an example.

FS  CUR  USD    CAD    USD_FS
BS  USD  1000   1364   X
BS  USD  2000   2729   X
IS  CAD  300    409    X
IS  USD  55     75     X
BS  CAD  1312   1790   X
IS  CAD  3156   4306   X
IS  USD  32165  43881  X
BS  CAD  32156  43869  X

The pseudocode I want to implement in pandas is:

ye_rate = 1.3642
average_rate = 1.2957
if FS == 'BS' and CUR == 'CAD':
   USD_FS = CAD/ye_rate
elif FS == 'IS' and CUR == 'CAD':
   USD_FS = CAD/average_rate
else:
   USD_FS = USD

This is what I have in pandas so far:

for i in range(0, len(df)):
    if df.loc[i]['Currency'] == 'CAD':
        if df.loc[i]['FS'] == 'BS':
            df.loc[i]['USD_FS'] = df.loc[i]['CAD']/ye_rate
        if df.loc[i]['FS'] == 'IS':
            df.loc[i]['USD_FS'] = df.loc[i]['CAD']/average_rate

I get this error:

A value is trying to be set on a copy of a slice from a DataFrame

For the sample table above, I want the following output:

FS  CUR  USD    CAD    USD_FS
BS  USD  1000   1364   1000
BS  USD  2000   2729   2000
IS  CAD  300    409    409/average_rate
IS  USD  55     75     55
BS  CAD  1312   1790   1790/ye_rate
IS  CAD  3156   4306   4306/average_rate
IS  USD  32165  43881  32165
BS  CAD  32156  43869  43869/ye_rate

Upvotes: 2

Views: 176

Answers (2)

AlexK

Reputation: 3011

If you want to keep relying on pandas only (even though it is built on top of NumPy), the correct syntax for the .loc indexer passes the row and column selectors in a single call:

df.loc[row_indexer,column_indexer]

Per the pandas documentation:

This is the correct access method

In [305]: dfc = pd.DataFrame({'A':['aaa','bbb','ccc'],'B':[1,2,3]})

In [306]: dfc.loc[0,'A'] = 11

...

This will not work at all, and so should be avoided

dfc.loc[0]['A'] = 1111
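
Applied to the question's table, the same df.loc[row_indexer, column_indexer] form also accepts a boolean mask as the row indexer, so the explicit loop can be dropped. The following is a minimal sketch, not a definitive implementation. It assumes the column names from the sample table (FS, CUR, USD, CAD, USD_FS) and the rule implied by the desired output: CAD rows on the BS use ye_rate, CAD rows on the IS use average_rate, and everything else keeps the USD amount. The names bs_cad and is_cad are illustrative only.

```python
import pandas as pd

ye_rate = 1.3642
average_rate = 1.2957

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "FS":  ["BS", "BS", "IS", "IS", "BS", "IS", "IS", "BS"],
    "CUR": ["USD", "USD", "CAD", "USD", "CAD", "CAD", "USD", "CAD"],
    "USD": [1000, 2000, 300, 55, 1312, 3156, 32165, 32156],
    "CAD": [1364, 2729, 409, 75, 1790, 4306, 43881, 43869],
})

# Default ("else") branch: start from the USD amount, stored as float
# so the divided values can be written into the same column.
df["USD_FS"] = df["USD"].astype(float)

# Boolean masks (hypothetical names) selecting the rows to overwrite.
bs_cad = (df["FS"] == "BS") & (df["CUR"] == "CAD")
is_cad = (df["FS"] == "IS") & (df["CUR"] == "CAD")

# One .loc call per rule: rows and column are given together,
# so the assignment writes into df itself rather than a copy.
df.loc[bs_cad, "USD_FS"] = df.loc[bs_cad, "CAD"] / ye_rate
df.loc[is_cad, "USD_FS"] = df.loc[is_cad, "CAD"] / average_rate
```

If the loop has to stay, replacing df.loc[i]['USD_FS'] = ... with df.loc[i, 'USD_FS'] = ... should be enough to avoid the chained assignment.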

Upvotes: 0

BENY

Reputation: 323326

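You can use np.select, which evaluates a list of conditions and, for each row, takes the value from the first matching choice. The rates here are placeholders:

import numpy as np

rate1 = 1
rate2 = 2
s1 = (df.FS == 'BS') & (df.CUR == 'CAD')  # first condition
s2 = (df.FS == 'IS') & (df.CUR == 'USD')  # second condition
# Rows matching neither condition fall back to the default.
np.select([s1, s2], [df.CAD * rate1, df.CAD * rate2], default=df.CAD)
# To store the result instead of just displaying it:
# df.CAD = np.select([s1, s2], [df.CAD * rate1, df.CAD * rate2], default=df.CAD)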

Out[280]: 
array([ 1364,  2729,   409,   150,  1790,  4306, 43881, 43869],
      dtype=int64)
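
To connect this to the question's actual rates, here is a hedged sketch of the same np.select pattern that divides instead of multiplying and falls back to the USD column. It assumes the column names from the sample table and the rule implied by the desired output (CAD rows on the BS use ye_rate, CAD rows on the IS use average_rate); adjust the conditions if your rule differs. The names conditions and choices are illustrative only.

```python
import numpy as np
import pandas as pd

ye_rate = 1.3642
average_rate = 1.2957

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "FS":  ["BS", "BS", "IS", "IS", "BS", "IS", "IS", "BS"],
    "CUR": ["USD", "USD", "CAD", "USD", "CAD", "CAD", "USD", "CAD"],
    "USD": [1000, 2000, 300, 55, 1312, 3156, 32165, 32156],
    "CAD": [1364, 2729, 409, 75, 1790, 4306, 43881, 43869],
})

# Conditions are checked in order; the first match wins for each row.
conditions = [
    (df["FS"] == "BS") & (df["CUR"] == "CAD"),
    (df["FS"] == "IS") & (df["CUR"] == "CAD"),
]
# One candidate result per condition, computed for every row.
choices = [
    df["CAD"] / ye_rate,
    df["CAD"] / average_rate,
]

# Rows matching no condition keep their USD amount.
df["USD_FS"] = np.select(conditions, choices, default=df["USD"])
```

Because both choices are computed for the whole column before selection, this stays fully vectorized, at the cost of a little wasted arithmetic on rows that end up using the default.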

Upvotes: 1
