Reputation: 70003
I have the following dataframe df
:
period remaining_et_months property_id beds
0 0 0 329 1
1 1 0 329 1
2 2 1 329 1
3 3 2 329 1
4 3 2 329 1
5 4 3 329 1
6 4 3 329 1
7 4 3 329 1
8 5 4 329 1
9 5 4 329 1
10 5 4 329 1
11 5 4 329 1
and I need to group this dataframe and apply a function so that I am able to compute a column called dist_period
:
g = df.groupby(['property_id', 'beds', 'period'])
g.apply(some_function)
I need to group by property_id
, beds
, and period
, because the dataframe is larger than the one in this example.
I don't have a clear idea how to achieve this, but what I want the column dist_period
to be like this:
period remaining_et_months dist_period
0 0 0 0
1 1 0 1
2 2 1 1
3 3 2 1
4 3 2 2
5 4 3 1
6 4 3 2
7 4 3 3
8 5 4 1
9 5 4 2
10 5 4 3
11 5 4 4
Notice that when the group has just one element the value of dist_period
is period - remaining_et_months
, but when the group has more than one element (see when period is 3, 4 or 5), then a count starting at one is perform.
Upvotes: 0
Views: 76
Reputation: 496
Get the cumcount for the counting on multiple row groups (edit: note that this starts counting from 0):
df['dist_period'] = df.groupby(['property_id', 'beds', 'period']).cumcount()
Then you just add the difference between period and remaining_et_months:
df['dist_period'] = df['period'] - df['remaining_et_months'] + df['dist_period']
Upvotes: 1