Reputation: 469
I have the data in the following format:
import pandas as pd
import matplotlib.pyplot as plt
     Metric  Country  Year    Value
0        2G  Austria  2018  1049522
1        2G  Austria  2019   740746
2        2G  Austria  2020   508452
3        2G  Austria  2021   343667
4        2G  Austria  2022   234456
65       3G  Austria  2018  2133823
66       3G  Austria  2019  1406927
67       3G  Austria  2020  1164042
68       3G  Austria  2021  1043169
69       3G  Austria  2022   920025
130      4G  Austria  2018  7482733
131      4G  Austria  2019  8551865
132      4G  Austria  2020  8982975
133      4G  Austria  2021  9090997
134      4G  Austria  2022  8905121
195      5G  Austria  2018        0
196      5G  Austria  2019        0
197      5G  Austria  2020    41995
198      5G  Austria  2021   188848
199      5G  Austria  2022   553826
I am trying to create an area chart of the values per year, split by metric.
For that, I create a pivot table to aggregate the results, as follows:
pivot_austria = pd.pivot_table(data_austria, index=['Metric'],
                               columns=['Year'],
                               values=['Value'],
                               aggfunc=sum,
                               fill_value=0)
Which returns the data in this format:
          Value
Year       2018     2019     2020     2021     2022
Metric
2G      1049522   740746   508452   343667   234456
3G      2133823  1406927  1164042  1043169   920025
4G      7482733  8551865  8982975  9090997  8905121
5G            0        0    41995   188848   553826
But when I try the plot command:
plot = plt.stackplot(pivot_austria.columns, pivot_austria.values, labels = pivot_austria.index)
I get an error:
return np.array(data, dtype=np.unicode)
ValueError: setting an array element with a sequence
I have tried many ways of plotting this, with and without the pivot, and nothing has worked so far. Does anyone know what I could be doing wrong?
Upvotes: 2
Views: 1018
Reputation: 153560
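I am not sure which kind of plot you are trying to generate, but removing the brackets around 'Value' will help. With values=['Value'], pivot_table returns MultiIndex columns such as ('Value', 2018), so pivot_austria.columns is a sequence of tuples that stackplot cannot use as x values. With values='Value', the columns are plain years.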
Let's try this first:
pivot_austria = pd.pivot_table(data_austria, index=['Metric'],
                               columns=['Year'],
                               values='Value',  # a string, not a list
                               aggfunc='sum',   # the string avoids a FutureWarning in recent pandas
                               fill_value=0)
# x: the years; y: one row of values per metric
plt.stackplot(pivot_austria.columns, pivot_austria.values, labels=pivot_austria.index)
ax = plt.gca()
ax.set_xticks(pivot_austria.columns)
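To see the difference the brackets make, here is a minimal sketch that compares the column labels of the two pivots. It assumes a four-row subset of the question's data, and the names nested and flat are only illustrative.

```python
import pandas as pd

# Four rows copied from the question, just enough to show the column labels.
data_austria = pd.DataFrame({
    "Metric": ["2G", "2G", "5G", "5G"],
    "Country": ["Austria"] * 4,
    "Year": [2021, 2022, 2021, 2022],
    "Value": [343667, 234456, 188848, 553826],
})

# values=['Value'] (a list) keeps 'Value' as an extra column level.
nested = pd.pivot_table(data_austria, index="Metric", columns="Year",
                        values=["Value"], aggfunc="sum", fill_value=0)

# values='Value' (a string) leaves plain year labels.
flat = pd.pivot_table(data_austria, index="Metric", columns="Year",
                      values="Value", aggfunc="sum", fill_value=0)

print(list(nested.columns))  # tuples of the form ('Value', year)
print(list(flat.columns))    # plain years
```

If you already have the nested version, nested['Value'] selects the inner level and gives the same frame as flat.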
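Or, as @pask suggests in their answer, let pandas handle it. The pivot is transposed so that the years run along the x-axis and each metric becomes one stacked area:
ax = pivot_austria.T.plot.area()
ax.set_xticks(pivot_austria.columns)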
EDIT to display as percentages:
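from matplotlib.ticker import PercentFormatter

by_year = pivot_austria.T  # years on the x-axis, one column per metric
# Scale by the largest yearly total, so the busiest year reaches 100%
ax = (by_year / by_year.sum(axis=1).max()).plot.area()
ax.set_xticks(by_year.index)
ax.set_ylim(0, 1)
ax.yaxis.set_major_formatter(PercentFormatter(xmax=1, decimals=2))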
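The snippet above scales everything by the largest yearly total. If you would rather have every year add up to exactly 100%, a variant is to divide each year by its own total. This is only a sketch: it assumes a frame with years as the index and metrics as columns, built here from the question's 2021 and 2022 values, and the names by_year and shares are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import pandas as pd
from matplotlib.ticker import PercentFormatter

# 2021 and 2022 values copied from the question; years as index, metrics as columns.
by_year = pd.DataFrame(
    {"2G": [343667, 234456], "3G": [1043169, 920025],
     "4G": [9090997, 8905121], "5G": [188848, 553826]},
    index=pd.Index([2021, 2022], name="Year"),
)

# Divide each row by its own total so that every year sums to 1.
shares = by_year.div(by_year.sum(axis=1), axis=0)

ax = shares.plot.area()
ax.set_xticks(shares.index)
ax.set_ylim(0, 1)
ax.yaxis.set_major_formatter(PercentFormatter(xmax=1))
```

With this normalization the top edge of the stack is flat at 100%, so the chart shows how the mix between metrics shifts rather than how the total changes.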
Upvotes: 5
Reputation: 927
pandas already includes an easy way to draw area plots.
Try:
pivot_austria.T.plot.area(xticks=pivot_austria.T.index)
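As a possible alternative to transposing, you could pivot with the years as the index in the first place. The following is a rough sketch under that assumption, using a six-row subset of the question's data; by_year is an illustrative name.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import pandas as pd

# Six rows copied from the question (4G and 5G, 2020 to 2022).
data_austria = pd.DataFrame({
    "Metric": ["4G", "4G", "4G", "5G", "5G", "5G"],
    "Country": ["Austria"] * 6,
    "Year": [2020, 2021, 2022, 2020, 2021, 2022],
    "Value": [8982975, 9090997, 8905121, 41995, 188848, 553826],
})

# Years as the index and metrics as columns: no transpose needed before plotting.
by_year = pd.pivot_table(data_austria, index="Year", columns="Metric",
                         values="Value", aggfunc="sum", fill_value=0)

ax = by_year.plot.area(xticks=by_year.index)
```

Note that values is passed as a plain string here, so the columns are simple metric names rather than a MultiIndex.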
Upvotes: 2