Reputation:
df = pd.DataFrame([['2018-02-03',42],
['2018-02-03',22],
['2018-02-03',10],
['2018-02-03',32],
['2018-02-03',10],
['2018-02-04',8],
['2018-02-04',2],
['2018-02-04',12],
['2018-02-03',20],
['2018-02-05',30],
['2018-02-05',5],
['2018-02-05',15]])
df.columns = ['product','date','quantity']
I want to create groups by date and calculate the minimum value of a 'quantity' column for all the groups respectively and subtract the value from all the values of a 'quantity' column of that group. The desired output is:
day value
2018-02-03 32 #(because, 42-10 = 32), 10 is minimum for date 2018-02-03.
2018-02-03 12
2018-02-03 0
2018-02-03 22
2018-02-03 0
2018-02-04 6
2018-02-04 0
2018-02-04 10
2018-02-03 10
2018-02-05 25
2018-02-05 0
2018-02-05 10
Now, this is what I tried:
df = df.groupby('Date', as_index = True)
datamin = df.groupby('Date')['quantity'].min()
Bu this creates a dataframe with the first quantity by Date ana I also do not know, how to proceed after this!!
Upvotes: 1
Views: 75
Reputation: 862601
For improve performance use GroupBy.transform
without lambda function, better is subtract all values of column like:
df['value'] = df['quantity'].sub(df.groupby('date')['quantity'].transform('min'))
Upvotes: 0
Reputation: 24314
try via groupby()
and transform()
:
df['value']=df.groupby('date')['quantity'].transform(lambda x:x-x.min())
output of df
:
date quantity value
0 2018-02-03 42 32
1 2018-02-03 22 12
2 2018-02-03 10 0
3 2018-02-03 32 22
4 2018-02-03 10 0
5 2018-02-04 8 6
6 2018-02-04 2 0
7 2018-02-04 12 10
8 2018-02-03 20 10
9 2018-02-05 30 25
10 2018-02-05 5 0
11 2018-02-05 15 10
Upvotes: 2