idclark
idclark

Reputation: 958

How to roll up events into metadata from original dataframe

I have data that looks like

Name,Report_ID,Amount,Flag,Actions
Fizz,123,5,,A
Fizz,123,10,Y,A
Buzz,456,10,,B
Buzz,456,40,,C
Buzz,456,70,,D
Bazz,678,100,Y,F

From these individual operations, i'd like to create a new dataframe that captures various statistics / meta name. Mostly summations and counts of items / counts of unique entries. I'd like the output of the dataframe to look like the following:

Report_ID,Number of Flags,Number of Entries, Total,Unique Actions
123,1,2,15,1
456,0,3,120,3
678,1,1,100,1

I've tried using groupby, but I cannot merge all of the individual groupby objects back together correctly. So far I've tried

totals = raw_data.groupby('Report_ID')['Amount'].sum()
event_count = raw_data.groupby('Report_ID').size()
num_actions = raw_data.groupby('Report_ID').Actions.nunique()

output = pd.concat([totals,event_count,num_actions])

When I try this i get TypeError: cannot concatenate a non-NDFrame object. Any help would be appreciated!

Upvotes: 0

Views: 53

Answers (2)

Alexander
Alexander

Reputation: 109546

You just need to specify axis=1 when concatenating:

event_count.name = 'Event Count'  # Name the Series, as you did not group on one.
>>> pd.concat([totals, event_count, num_actions], axis=1)

           Amount  Event Count  Actions
Report_ID                              
123            15            2        1
456           120            3        3
678           100            1        1  

Upvotes: 0

piRSquared
piRSquared

Reputation: 294298

You can use agg on the groupby

f = dict(Flag=['count', 'size'], Amount='sum', Actions='nunique')
df.groupby('Report_ID').agg(f)

           Flag      Amount Actions
          count size    sum nunique
Report_ID                          
123           1    2     15       1
456           0    3    120       3
678           1    1    100       1

Upvotes: 1

Related Questions