Reputation: 47
I've got a df that looks like this with duplicate IDs:
ID Usage_type
0 5 Note
1 6 Note
2 7 Service
3 5 Note
4 7 Note
5 10 Service
I want two extra columns that indicate the cumulative count of Usage_type for each ID, like so:
ID Usage_type type_Note type_Service
0 5 Note 1 0
1 6 Note 1 0
2 7 Service 0 1
3 5 Note 2 0
4 7 Note 1 1
5 10 Service 0 1
I've used a cumulative count to get the total count of Usage_type per ID, but I want to break it down further into separate running counts for each string (e.g. in row 4, type_Service stays at 1 because ID 7 already had a Service row earlier).
The screenshot below shows the current counts for an example ID.
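For reference, a minimal reproducible setup plus the plain grouped cumulative count I have so far (the total_count column name is just illustrative):
import pandas as pd
df = pd.DataFrame({
    'ID': [5, 6, 7, 5, 7, 10],
    'Usage_type': ['Note', 'Note', 'Service', 'Note', 'Note', 'Service'],
})
# Running count of rows per ID, regardless of Usage_type
df['total_count'] = df.groupby('ID').cumcount() + 1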
Upvotes: 1
Views: 146
Reputation: 359
I found an answer to your problem and solved it by creating a custom function. Here's the bit of code that worked for me:
import pandas as pd

def truefalse(mask):
    # Running count of True values, written only at the positions where mask is True
    count = 0
    df_2 = pd.DataFrame({'col': [0] * len(mask)})
    for i in range(len(mask)):
        if mask[i]:
            count += 1
            df_2.loc[i, 'col'] = count
    return df_2

ID = [5, 6, 7, 5, 7, 10]
usg = ['Note', 'Note', 'Service', 'Note', 'Note', 'Service']
df = pd.DataFrame({'ID': ID, 'Usage_type': usg})

# Start both counter columns at zero, then add the per-ID running counts
df['type_Note'] = 0
df['type_Service'] = 0
for unid in df['ID'].unique():
    df['type_Note'] += truefalse((df['Usage_type'] == 'Note') & (df['ID'] == unid))['col']
    df['type_Service'] += truefalse((df['Usage_type'] == 'Service') & (df['ID'] == unid))['col']
Hope this helps!
Upvotes: 0
Reputation: 13821
You could filter your dataframe on the different Usage_type with loc
and perform a grouped cumulative count. Lastly, a grouped ffill()
will account for the cases you explained to me in the comments:
# [*df] expands to the column names, so this groups by both ID and Usage_type
grp_cc = df.groupby([*df]).cumcount() + 1
df.loc[df.Usage_type == 'Note', 'type_Note'] = grp_cc
df.loc[df.Usage_type == 'Service', 'type_Service'] = grp_cc
This will get you:
ID Usage_type type_Note type_Service
0 5 Note 1.0 NaN
1 6 Note 1.0 NaN
2 7 Service NaN 1.0
3 5 Note 2.0 NaN
4 7 Note 2.0 NaN
5 10 Service NaN 1.0
And then performing a grouped forward fill and filling in the null values gets you what you need:
df = df.groupby('ID').ffill().fillna(0)
>>> df
Usage_type type_Note type_Service
0 Note 1.0 0.0
1 Note 1.0 0.0
2 Service 0.0 1.0
3 Note 2.0 0.0
4 Note 1.0 1.0
5 Service 0.0 1.0
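Note that reassigning df like this drops the ID column and leaves the counts as floats. If you'd rather keep ID and get integer counts, one small variation (my own sketch, not required by the question) is to fill only the two new columns instead of reassigning the whole frame:
df[['type_Note', 'type_Service']] = (
    df.groupby('ID')[['type_Note', 'type_Service']].ffill().fillna(0).astype(int)
)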
Upvotes: 2