Reputation: 149
I need to create a new column in a pandas DataFrame which is calculated as the ratio of 2 existing columns in the DataFrame. However, the denominator in the ratio calculation will change based on the value of a string which is found in another column in the DataFrame.
Example. Sample dataset :
import pandas as pd
df = pd.DataFrame(data={'hand' : ['left','left','both','both'],
'exp_force' : [25,28,82,84],
'left_max' : [38,38,38,38],
'both_max' : [90,90,90,90]})
I need to create a new DataFrame column df['ratio']
based on the condition of df['hand']
.
If df['hand']=='left'
then df['ratio'] = df['exp_force'] / df['left_max']
If df['hand']=='both'
then df['ratio'] = df['exp_force'] / df['both_max']
Upvotes: 2
Views: 4185
Reputation: 1564
You can use the apply()
method of your dataframe :
df['ratio'] = df.apply(
lambda x: x['exp_force'] / x['left_max'] if x['hand']=='left' else x['exp_force'] / x['both_max'],
axis=1
)
Upvotes: 2
Reputation: 16683
You can use np.where()
:
import pandas as pd
df = pd.DataFrame(data={'hand' : ['left','left','both','both'],
'exp_force' : [25,28,82,84],
'left_max' : [38,38,38,38],
'both_max' : [90,90,90,90]})
df['ratio'] = np.where((df['hand']=='left'), df['exp_force'] / df['left_max'], df['exp_force'] / df['both_max'])
df
Out[42]:
hand exp_force left_max both_max ratio
0 left 25 38 90 0.657895
1 left 28 38 90 0.736842
2 both 82 38 90 0.911111
3 both 84 38 90 0.933333
Alternatively, in a real-life scenario, if you have lots of conditions and results, then you can use np.select()
, so that you don't have to keep repeating your np.where()
statement as I have done a lot in my older code. It's better to use np.select
in these situations:
import pandas as pd
df = pd.DataFrame(data={'hand' : ['left','left','both','both'],
'exp_force' : [25,28,82,84],
'left_max' : [38,38,38,38],
'both_max' : [90,90,90,90]})
c1 = (df['hand']=='left')
c2 = (df['hand']=='both')
r1 = df['exp_force'] / df['left_max']
r2 = df['exp_force'] / df['both_max']
conditions = [c1,c2]
results = [r1,r2]
df['ratio'] = np.select(conditions,results)
df
Out[430]:
hand exp_force left_max both_max ratio
0 left 25 38 90 0.657895
1 left 28 38 90 0.736842
2 both 82 38 90 0.911111
3 both 84 38 90 0.933333
Upvotes: 3
Reputation: 8940
Enumerate
for i,e in enumerate(df['hand']):
if e == 'left':
df.at[i,'ratio'] = df.at[i,'exp_force'] / df.at[i,'left_max']
if e == 'both':
df.at[i,'ratio'] = df.at[i,'exp_force'] / df.at[i,'both_max']
df
Output:
hand exp_force left_max both_max ratio
0 left 25 38 90 0.657895
1 left 28 38 90 0.736842
2 both 82 38 90 0.911111
3 both 84 38 90 0.933333
Upvotes: 1