Reputation: 313
How one can plot the count
of the n
most frequent elements across all groups for a given multi group time series? Note this is different from n
most frequent elements of each group, which could be accomplished with count
and nlargest
.
Given a dataframe:
import pandas as pd
data = {'year': [2020, 2020, 2021, 2021, 2022],
'month': [1, 1, 2, 2, 3],
'Name': ['name_1', 'name_2', 'name_1', 'name_2', 'name_1'],
'count': [10, 12, 8, 10, 2]}
df = pd.DataFrame(data)
print(df)
which outputs
year month Name Count
0 2020 1 name_1 10
1 2020 1 name_2 12
2 2021 2 name_1 8
3 2021 2 name_2 10
4 2022 3 name_1 2
year
and month
n = 1
, in other words the most frequent oneI would like to plot only name_1
's count
since, although it does not have the largest count in any group (or even overall), it "appears" more times.
Upvotes: 1
Views: 69
Reputation: 260790
IIUC, you want to filter the most common Name and plot the counts?
# get top Name
top = df['Name'].value_counts().index[0]
# filter
df2 = df[df['Name'].eq(top)]
# plot
(df2.assign(date=df2[['year', 'month']].astype(str).apply('_'.join, axis=1))
.plot.bar(x='date', y='count')
)
# get top Name
top = df['Name'].value_counts().index[:2]
# filter and reshape
df2 = (df[df['Name'].isin(top)]
.pivot(index=['year', 'month'],
columns='Name',
values='count')
)
# plot
df2.plot.bar()
Upvotes: 1