Reputation: 129
I'm sorry, but I'm bad with words, so I'll just give an example of what I'm looking for.
ID date1 date2 date3 date4 date5 date6
001 0 5 10 15 5 40
002 0 20 50 0 10 15
003 5 15 5 30 10 0
If I have this dataframe of item prices, is there a way to sum it column by column (a running total across each row), apply an if condition, and update the value in the dataframe? For example: if the running sum is < 10, return the same amount; if it's >= 10, return 2x the amount; and if it's >= 50, return 3x the amount.
I want to get this dataframe as result:
ID date1 date2 date3 date4 date5 date6
001 0 5 20 30 10 120
002 0 40 150 0 30 45
003 5 30 10 90 30 0
In the case of ID 001, date2 stays the same because the running total is < 10, but date3 becomes 10x2 because the total so far is 0+5+10 = 15, which is >= 10, so the amount is doubled; the same goes for date4 and date5, while date6 is x3 because the total reaches 75, which is >= 50.
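For reference, a minimal sketch of building the input frame (ID kept as a regular column; names assumed from the tables above):

import pandas as pd

# Example data from the question; ID kept as a regular column
df = pd.DataFrame({
    'ID':    ['001', '002', '003'],
    'date1': [0, 0, 5],
    'date2': [5, 20, 15],
    'date3': [10, 50, 5],
    'date4': [15, 0, 30],
    'date5': [5, 10, 10],
    'date6': [40, 15, 0],
})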
Upvotes: 3
Views: 317
Reputation: 765
import pandas as pd

def function1(ss: pd.Series):
    # Walk the row left to right: check the current value plus the previous one
    # (which may already have been scaled), then scale in place.
    # Note this compares adjacent values, not the full running sum of the row.
    for i in range(1, len(ss)):
        if ss.iloc[i] + ss.iloc[i - 1] >= 50:
            ss.iloc[i] *= 3
        elif ss.iloc[i] + ss.iloc[i - 1] >= 10:
            ss.iloc[i] *= 2
    return ss

df1.apply(function1, axis=1)
date1 date2 date3 date4 date5 date6
0 0 5 20 30 10 120
1 0 40 150 0 20 30
2 5 30 10 60 30 0
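For completeness, a hedged sketch of wiring this up, assuming df is the example frame from the question and df1 is just its date columns (so the ID column isn't scaled):

df1 = df.filter(like='date')             # date columns only; assumed layout
result = df1.apply(function1, axis=1)    # apply returns a new frame
df[df1.columns] = result                 # write the scaled values back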
Upvotes: 0
Reputation: 13387
You can map values in a vectorized manner, casting bool to int:
x = df.iloc[:, 1:].cumsum(axis=1)
df.iloc[:, 1:] = (x.ge(50).astype(int) + x.ge(10).astype(int) + 1).mul(df.iloc[:, 1:])
So the multiplier term
(x.ge(50).astype(int) + x.ge(10).astype(int) + 1)
maps to 3 when the cumulative sum is at least 50, to 2 when it is at least 10, and to 1 otherwise.
Outputs:
ID date1 date2 date3 date4 date5 date6
0 001 0 5 20 30 10 120
1 002 0 40 150 0 30 45
2 003 5 30 10 90 30 0
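To see the multiplier mapping on its own, here's a small sketch using the running totals of ID 001 (0, 5, 15, 30, 35, 75):

import pandas as pd

x = pd.Series([0, 5, 15, 30, 35, 75])    # cumulative sums for ID 001
mult = x.ge(50).astype(int) + x.ge(10).astype(int) + 1
print(mult.tolist())                     # [1, 1, 2, 2, 2, 3]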
Upvotes: 1
Reputation: 192
Using applymap
I might have misunderstood the question: this applies a calculation to every cell independently, without any awareness of the running sum across the date columns.
You can define your own custom function, then apply it to each cell with pandas.DataFrame.applymap(function_name):
import pandas as pd

# DataFrame creation
cols = ['date1', 'date2', 'date3', 'date4', 'date5', 'date6']
index = ['001', '002', '003']
data = [[0, 5, 10, 15, 5, 40],
        [0, 20, 50, 0, 10, 15],
        [5, 15, 5, 30, 10, 0]]
df = pd.DataFrame(data)
df.columns = cols
df.index = index
df.index.name = 'ID'

# Define custom function
def custom_multiply(x):
    if x < 10:
        return x
    elif x < 50:
        return x * 2
    else:  # x >= 50
        return x * 3

# Apply function to all cells and reassign to df-object
df = df.applymap(custom_multiply)
df
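Note that on recent pandas versions (2.1 and later) DataFrame.applymap is deprecated in favor of the element-wise DataFrame.map, so depending on your version the last step would be:

# pandas >= 2.1: same element-wise apply, new name
df = df.map(custom_multiply)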
Upvotes: 0
Reputation: 71689
You can filter the date-like columns, take the cumsum of these columns along axis=1, then finally use np.select with the specified conditions and corresponding choices to get the final result:
import numpy as np

s = df.filter(like='date')
cs = s.cumsum(1)  # cumulative sum along axis=1
df[s.columns] = np.select([cs.ge(50), cs.ge(10)], [s*3, s*2], s)
ID date1 date2 date3 date4 date5 date6
0 001 0 5 20 30 10 120
1 002 0 40 150 0 30 45
2 003 5 30 10 90 30 0
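One thing to keep in mind with np.select: the conditions are evaluated in order and the first match wins, so cs.ge(50) has to come before cs.ge(10); swapping them would only ever double values whose running total reaches 50. A tiny sketch of that first-match behavior:

import numpy as np

a = np.array([5, 20, 60])
# first matching condition wins: 60 hits the ">= 50" branch, 20 the ">= 10" branch
print(np.select([a >= 50, a >= 10], [a * 3, a * 2], a))   # -> [5 40 180]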
Upvotes: 7