Reputation: 35
I have two data frames, trying to use entries from df1
to limit amounts in df2
, then add them up. It seems like my code is limiting right, but not adding the amounts up.
Code:
import pandas as pd
df1 = pd.DataFrame({'Caps':['25','45','65']})
df2 = pd.DataFrame({'Amounts':['45','25','65','35','85']})
df1['Capped'] = df1.apply(lambda row: df2['Amounts'].where(
df2['Amounts'] <= row['Caps'], row['Caps']).sum(), axis=1)
Output:
>>> df1
Caps Capped
0 25 2525252525
1 45 4525453545
2 65 4525653565
Upvotes: 2
Views: 155
Reputation: 862641
First is necessary convert values to integers by Series.astype
:
df1['Caps'] = df1['Caps'].astype(int)
df2['Amounts'] = df2['Amounts'].astype(int)
df1['Capped'] = df1.apply(lambda row: df2['Amounts'].where(
df2['Amounts'] <= row['Caps'], row['Caps']).sum(), axis=1)
print (df1)
Caps Capped
0 25 125
1 45 195
2 65 235
For improve performance is possible use numpy.where
with broadcasting:
df1['Caps'] = df1['Caps'].astype(int)
df2['Amounts'] = df2['Amounts'].astype(int)
am = df2['Amounts'].to_numpy()
ca = df1['Caps'].to_numpy()
#pandas below 0.24
#am = df2['Amounts'].values
#ca = df1['Caps'].values
df1['Capped'] = np.where(am <= ca[:, None], am[None, :], ca[:, None]).sum(axis=1)
print (df1)
Caps Capped
0 25 125
1 45 195
2 65 235
Upvotes: 1