Reputation: 109
I have a list of objects from which I created a DataFrame, grouped the rows by usage_start_date, and plotted a stacked bar graph. I'm wondering how I can extract the top 5 most expensive services by summed cost. For one date I can have 10 services, but I want the graph to show only the 5 most expensive ones. Here is the current code:
import random
import matplotlib.pyplot as plt
import pandas as pd
dates = ["2022-02-13T13:43:22+00:00", "2022-02-14T13:43:22+00:00", "2022-02-15T13:43:22+00:00", "2022-02-16T13:43:22+00:00"]
service_name = ["Example service 1", "Example service 2", "Example service 3", "Example service 4", "Example service 5", "Example service 6", "Example service 7", "Example service 8", "Example service 9", 'Example 10']
data = []
for i in range(0, 50):
    tmp_data = {
        "usage_start_date": random.choice(dates),
        "cost": random.randrange(100),
        "service_name": random.choice(service_name)
    }
    data.append(tmp_data)
df = pd.DataFrame(data)
df['usage_start_date'] = pd.to_datetime(df['usage_start_date'], utc=True).dt.tz_convert(None).dt.date
df['usage_start_date'] = pd.to_datetime(df['usage_start_date'], format='%Y-%m-%d')
grouped_data = df.groupby(['usage_start_date', 'service_name'], as_index=False, group_keys=True).sum() #.nlargest(n=5, columns=['cost'])
df1 = grouped_data.sort_values(by=['usage_start_date', 'cost'], ascending=[False, False]) #.nlargest(n=5, columns=['cost'])
df1.pivot(index="usage_start_date", columns="service_name", values="cost").plot(kind="bar", stacked=True, width=0.2)
plt.legend(title="Service names")
plt.show()
I tried adding nlargest(n=5, columns=['cost']), but it does not work.
Upvotes: 1
Views: 2045
Reputation: 8219
You can replace the plotting line with
df1.sort_values('cost', ascending = False).groupby('usage_start_date').head(5).pivot(index="usage_start_date", columns="service_name", values="cost").plot(kind="bar", stacked=True, width=0.2)
where we first sort by cost in descending order, then group by usage_start_date and keep the first 5 rows of each group, i.e. the 5 most expensive services for each date.
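To see the per-date selection in isolation, here is a minimal sketch of the same sort, group, and head(5) chain without the plotting step. The fixed random seed, the shortened date strings, and the names totals, top5 and pivoted are assumptions made for illustration; they are not in the question.

```python
import random

import pandas as pd

random.seed(0)  # assumption: fixed seed so the sketch is reproducible

dates = ["2022-02-13", "2022-02-14", "2022-02-15", "2022-02-16"]
services = [f"Example service {i}" for i in range(1, 11)]

rows = [
    {
        "usage_start_date": random.choice(dates),
        "cost": random.randrange(100),
        "service_name": random.choice(services),
    }
    for _ in range(50)
]
df = pd.DataFrame(rows)
df["usage_start_date"] = pd.to_datetime(df["usage_start_date"])

# Total cost per (date, service) pair.
totals = df.groupby(["usage_start_date", "service_name"], as_index=False)["cost"].sum()

# Sort by cost (highest first), then keep the first 5 rows of every date group.
top5 = (
    totals.sort_values("cost", ascending=False)
    .groupby("usage_start_date")
    .head(5)
)

# One row per date, one column per service; NaN where a service
# is not among that date's top 5.
pivoted = top5.pivot(index="usage_start_date", columns="service_name", values="cost")
print(pivoted)
```

Calling .plot(kind="bar", stacked=True, width=0.2) on pivoted should then draw the same kind of stacked chart as in the question. Calling nlargest directly on the whole frame, by contrast, keeps only the 5 largest rows overall rather than 5 per date.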
The 5 most expensive services differ from date to date, and I did not relabel them, so the legend still lists all 10 services. However, there are only 5 services per date. The graph looks like this:
Upvotes: 1