jgtrz
jgtrz

Reputation: 375

Ranking and Aggregating by dates on a group in pandas df

I'm trying to create a new column that calculates the % completion per Project ID. I'm currently calculating the Project Week Num (group by Project ID) but I want to calculate the % completion, meaning (current [Project Week Num] based on ReportDate divided by total number of [Project Week Num]).

Here is the code on how I'm calculating current Project Week Num:

df['Project Week Num'] = df.groupby(['Project ID'])["ReportDate"].transform(lambda x: list(map(lambda y: dict(map(reversed, dict(enumerate(x.unique())).items()))[y]+1,x)))

For the example in the screenshot, this project has 106 total reports, so when

Project Week Num = 1, the [% Project Completition] would = 0.94%

Project Week Num = 2, the [% Project Completition] would = 1.88%

etc

enter image description here

Upvotes: 0

Views: 37

Answers (1)

jezrael
jezrael

Reputation: 863401

Use:

#if ReportDate are unique
df['Project Week Num'] = df.groupby('Project ID').cumcount()

s = df.groupby(['Project ID'])['Project ID'].transform('size')
df['%'] = df['Project Week Num'].div(s)

Upvotes: 0

Related Questions