Reputation: 447
I have a line of code which produces a tabled count from a groupby
command of two columns:
beat_pri = df_2011.groupby(['Beat', 'Priority']).size()
table produced:
Beat Priority Count
01X 1.0 394
2.0 1816
02X 1.0 644
2.0 1970
02Y 1.0 661
...
What I am trying to do from this is create a new column 'Total'
which is the total count grouped by the beat
column. For e.g.
Beat Priority Count Total
01X 1.0 394 2210
2.0 1816
02X 1.0 644 2614
2.0 1970
02Y 1.0 661 2970
2.0 2309
...
What I have tried so far is:
beat_pri2011['Total'] = df_2011.groupby(['Beat']).size().to_frame('total')
However this produces:
Beat Priority Count Total
01X 1.0 394 NaN
2.0 1816 NaN
02X 1.0 644 NaN
2.0 1970 NaN
02Y 1.0 661 NaN
2.0 2309 NaN
...
Thanks in advance.
(I have re-typed this question to clear up any confusion it brought earlier on.)
Upvotes: 2
Views: 106
Reputation: 5461
I mis-understood your question earlier. What I understand from you recent edit is you want to sort the aggregates after groupby by total count.
Is the following what you wanted
df = pd.DataFrame([('ST&SAN P...', '1.0', '06X', '1.0', 'PDOA', 'POSSIBLE DEAD PERSON'), ('ST&HANNAH ST', '1.0', '07X', '1.0', '415GS', '415 GUNSHOTS'), ('ST&MARKET ST', '1.0', '07X', '2.0', '415GS', '415 GUNSHOTS'), ('PRENTISS ST', '2.0', '06X', '2.0', '415GS', '415 GUNSHOTS'), ('AV&FOOTH...', '2.0', '07X', '1.0', '415GS', '415 GUNSHOTS')], columns=('Location', 'Area-Id', 'Beat', 'Priority', 'Type-Id', 'Incident Type Description'))
df2 = df.groupby(["Beat", "Priority"])[["Location"]].count() \
.rename(dict(Location="Count"), axis=1).reset_index()
df2.merge(df2.groupby("Beat").sum(), on="Beat", suffixes=("", "_Total")) \
.sort_values("Count_Total", ascending=False)
Result
Beat Priority Count Count_Total
2 07X 1.0 2 3
3 07X 2.0 1 3
0 06X 1.0 1 2
1 06X 2.0 1 2
Upvotes: 2