Reputation: 148
I have a dump of transactions. Columns in the data set give each transaction's currency and the FS it flows to.
I want to translate currencies at two different rates depending on which FS the transaction flows to. There are two currencies, USD and CAD, and two FS. I have one column with all amounts in USD and one with all amounts in CAD. See the table below for an example.
FS CUR USD CAD USD_FS
BS USD 1000 1364 X
BS USD 2000 2729 X
IS CAD 300 409 X
IS USD 55 75 X
BS CAD 1312 1790 X
IS CAD 3156 4306 X
IS USD 32165 43881 X
BS CAD 32156 43869 X
The pseudocode I want to implement in pandas is:
ye_rate = 1.3642
average_rate = 1.2957
if FS == 'BS' and CUR == 'CAD':
    USD_FS = CAD/ye_rate
else if FS == 'IS' and CUR == 'CAD':
    USD_FS = CAD/average_rate
else:
    USD_FS = USD
This is what I have in pandas so far:
for i in range(0, len(df)):
    if df.loc[i]['Currency'] == 'CAD':
        if df.loc[i]['FS'] == 'BS':
            df.loc[i]['USD_FS'] = df.loc[i]['CAD']/ye_rate
        if df.loc[i]['FS'] == 'IS':
            df.loc[i]['USD_FS'] = df.loc[i]['CAD']/average_rate
I get this error:
A value is trying to be set on a copy of a slice from a DataFrame
For the sample table above, I want the following output:
FS CUR USD CAD USD_FS
BS USD 1000 1364 1000
BS USD 2000 2729 2000
IS CAD 300 409 409/average_rate
IS USD 55 75 55
BS CAD 1312 1790 1790/ye_rate
IS CAD 3156 4306 4306/average_rate
IS USD 32165 43881 32165
BS CAD 32156 43869 43869/ye_rate
Upvotes: 2
Views: 176
Reputation: 3011
If you want to keep relying on pandas alone (even though it is built on top of NumPy), the correct syntax for the .loc indexer is:
df.loc[row_indexer,column_indexer]
Per Pandas' documentation:
# This is the correct access method
In [305]: dfc = pd.DataFrame({'A':['aaa','bbb','ccc'],'B':[1,2,3]})
In [306]: dfc.loc[0,'A'] = 11
...
# This will not work at all, and so should be avoided
dfc.loc[0]['A'] = 1111
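Applied to the question, the same df.loc[row_indexer, column_indexer] form can take a boolean mask as the row indexer, which removes the need for a loop. The following is only a sketch under some assumptions: the column names (FS, CUR, USD, CAD) are taken from the sample table rather than from the asker's code, the conditions follow the desired output table, and the mask names bs_cad and is_cad are made up for illustration.

```python
import pandas as pd

ye_rate = 1.3642
average_rate = 1.2957

# Sample data copied from the question's table (column names assumed from it)
df = pd.DataFrame({
    "FS":  ["BS", "BS", "IS", "IS", "BS", "IS", "IS", "BS"],
    "CUR": ["USD", "USD", "CAD", "USD", "CAD", "CAD", "USD", "CAD"],
    "USD": [1000, 2000, 300, 55, 1312, 3156, 32165, 32156],
    "CAD": [1364, 2729, 409, 75, 1790, 4306, 43881, 43869],
})

# Default case: take the USD column as-is (as float, so divided values fit)
df["USD_FS"] = df["USD"].astype(float)

# Hypothetical mask names; each mask selects the rows for one special case
bs_cad = (df["FS"] == "BS") & (df["CUR"] == "CAD")
is_cad = (df["FS"] == "IS") & (df["CUR"] == "CAD")

# A single .loc call with (rows, column) assigns into df itself, not a copy
df.loc[bs_cad, "USD_FS"] = df.loc[bs_cad, "CAD"] / ye_rate
df.loc[is_cad, "USD_FS"] = df.loc[is_cad, "CAD"] / average_rate
```

Because each assignment goes through one .loc call, pandas writes into the original DataFrame and the "value is trying to be set on a copy" message should not appear.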
Upvotes: 0
Reputation: 323326
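You may need np.select:
import numpy as np

# Placeholder rates; substitute the real rates
rate1 = 1
rate2 = 2
# Boolean masks for the two special cases
s1 = (df.FS == 'BS') & (df.CUR == 'CAD')
s2 = (df.FS == 'IS') & (df.CUR == 'USD')
# First matching choice wins per row; rows matching neither keep df.CAD
np.select([s1, s2], [df.CAD * rate1, df.CAD * rate2], default=df.CAD)
# df.CAD = np.select([s1, s2], [df.CAD * rate1, df.CAD * rate2], default=df.CAD)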
Out[280]:
array([ 1364, 2729, 409, 150, 1790, 4306, 43881, 43869],
dtype=int64)
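To get the USD_FS column the question asks for, the same np.select pattern might be adapted as below. This is a sketch under assumptions, not a tested drop-in: it uses the question's ye_rate and average_rate, takes the conditions from the desired output table (CAD rows only), divides instead of multiplying, and falls back to the USD column. The names conditions and choices are illustrative.

```python
import numpy as np
import pandas as pd

ye_rate = 1.3642
average_rate = 1.2957

# Sample data copied from the question's table
df = pd.DataFrame({
    "FS":  ["BS", "BS", "IS", "IS", "BS", "IS", "IS", "BS"],
    "CUR": ["USD", "USD", "CAD", "USD", "CAD", "CAD", "USD", "CAD"],
    "USD": [1000, 2000, 300, 55, 1312, 3156, 32165, 32156],
    "CAD": [1364, 2729, 409, 75, 1790, 4306, 43881, 43869],
})

# One condition per special case, paired by position with a choice
conditions = [
    (df["FS"] == "BS") & (df["CUR"] == "CAD"),
    (df["FS"] == "IS") & (df["CUR"] == "CAD"),
]
choices = [
    df["CAD"] / ye_rate,       # balance-sheet CAD rows
    df["CAD"] / average_rate,  # income-statement CAD rows
]

# Rows matching neither condition fall back to the existing USD amount
df["USD_FS"] = np.select(conditions, choices, default=df["USD"])
```

The conditions are evaluated in order and the first match wins, so the list order matters only if the masks can overlap; here they cannot.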
Upvotes: 1