Reputation: 55
I have a df called high that looks like this:
   white  black  asian  native  NH_PI  latin
0  10239  26907   1079     670     80   1101
I'm trying to create a simple pie chart with matplotlib. I've looked at multiple examples and other SO pages like this one, but I keep getting this error:
Traceback (most recent call last):
  File "I:\Sustainability & Resilience\Food Policy\Interns\Lara Haase\data_exploration.py", line 62, in <module>
    plt.pie(sizes, explode=None, labels = high.columns, autopct='%1.1f%%', shadow=True, startangle=140)
  File "C:\Python27\ArcGIS10.6\lib\site-packages\matplotlib\pyplot.py", line 3136, in pie
    frame=frame, data=data)
  File "C:\Python27\ArcGIS10.6\lib\site-packages\matplotlib\__init__.py", line 1819, in inner
    return func(ax, *args, **kwargs)
  File "C:\Python27\ArcGIS10.6\lib\site-packages\matplotlib\axes\_axes.py", line 2517, in pie
    raise ValueError("'label' must be of length 'x'")
ValueError: 'label' must be of length 'x'
I've tried multiple different ways to make sure the labels and values match up. There are 6 of each, but I can't understand why Python disagrees with me.
Here is one way I've tried:
plt.pie(high.values, explode=None, labels = high.columns, autopct='%1.1f%%', shadow=True, startangle=140)
And another way:
labels = list(high.columns)
sizes = list(high.values)
plt.pie(sizes, explode=None, labels = labels, autopct='%1.1f%%', shadow=True, startangle=140)
I have also tried with .loc:
labels = list(high.columns)
sizes = high.loc[[0]]
print(labels)
print(sizes)
plt.pie(sizes, explode=None, labels = labels, autopct='%1.1f%%', shadow=True, startangle=140)
But no matter what I've tried, I keep getting that same ValueError. Any thoughts?
Upvotes: 1
Views: 1129
Reputation: 2740
Just to expand on @ScottBoston's post,
Plotting a pie chart from a data frame with one row is not possible unless you reshape the data into a single column or series.
An operation I typically use is .stack():
df = df.stack()
.stack() is very similar to .T, but returns a series with the column names as a second index level. This is handy when you have multiple rows and want to retain the original indexing. The result of df.stack() is:
0  white     10239
   black     26907
   asian      1079
   native      670
   NH_PI        80
   latin      1101
dtype: int64
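To make the difference between .stack() and .T concrete, here is a minimal sketch that rebuilds the one-row frame from the question and compares the two results. The variable names stacked and transposed are only for illustration and do not appear in the original post.

```python
import pandas as pd

# One-row frame rebuilt from the values shown in the question.
high = pd.DataFrame(
    {"white": [10239], "black": [26907], "asian": [1079],
     "native": [670], "NH_PI": [80], "latin": [1101]}
)

stacked = high.stack()   # Series indexed by (row label, column name)
transposed = high.T      # DataFrame with a single column named 0

print(type(stacked).__name__, stacked.shape)
print(type(transposed).__name__, transposed.shape)
print(stacked.index.nlevels)  # number of index levels on the stacked Series
```

The stacked result is one-dimensional, which is the shape a pie chart needs, while the transposed result is still a two-dimensional frame.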
After I stack() a data frame, I typically assign a name to the resulting series using:
df.name = 'Race'
Setting a name is not required, but it helps when you actually plot the data using pd.Series.plot.pie, which uses the name as the axis label.
If the data frame df had more than one row of data, you could then plot a pie chart for each row using .groupby:
for name, group in df.groupby(level=0):
    group.index = group.index.droplevel(0)
    plt.figure()  # new figure per row so the pies are not drawn on top of each other
    group.plot.pie(autopct='%1.1f%%', shadow=True, startangle=140)
Since the first level of the index only provides the positional index from the input data, I drop that level to make the labels on the plot appear as desired.
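As a rough sketch of the multi-row case, the snippet below draws one pie per row on side-by-side axes. The second row of numbers is made up purely for illustration and is not from the question; the Agg backend and the names stacked, fig, and axes are likewise assumptions so that the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Row 0 is copied from the question; row 1 is invented for illustration only.
df = pd.DataFrame(
    [[10239, 26907, 1079, 670, 80, 1101],
     [500, 400, 300, 200, 100, 50]],
    columns=["white", "black", "asian", "native", "NH_PI", "latin"],
)

stacked = df.stack()     # Series indexed by (row label, column name)
stacked.name = "Race"

# One axes per row, so each pie gets its own panel.
fig, axes = plt.subplots(1, len(df), figsize=(10, 5))
for ax, (row, group) in zip(axes, stacked.groupby(level=0)):
    # Drop the row level so only the column names are used as wedge labels.
    group.index = group.index.droplevel(0)
    group.plot.pie(ax=ax, autopct="%1.1f%%", startangle=140)
    ax.set_title("row {}".format(row))
```

Passing ax explicitly is one way to keep the charts separate; opening a new figure per group would work as well.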
If you don't want to use pandas to make the pie chart, this worked for me:
plt.pie(df.squeeze().values, labels=df.columns.tolist(), autopct='%1.1f%%', shadow=True, startangle=140)
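For completeness, here is a self-contained sketch of that matplotlib-only call, using the frame from the question. The Agg backend is an assumption so it can run without a display; swap it out and call plt.show() if you want a window.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove to use an interactive one
import matplotlib.pyplot as plt
import pandas as pd

# One-row frame rebuilt from the values shown in the question.
high = pd.DataFrame(
    {"white": [10239], "black": [26907], "asian": [1079],
     "native": [670], "NH_PI": [80], "latin": [1101]}
)

sizes = high.squeeze().values    # 1-D array with six numbers
labels = high.columns.tolist()   # plain list with six strings

fig, ax = plt.subplots()
patches, texts, autotexts = ax.pie(
    sizes, labels=labels, autopct="%1.1f%%", shadow=True, startangle=140
)
ax.axis("equal")  # keep the pie circular
```

The key point is that sizes and labels both have length six before they reach pie.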
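This attempt didn't work because high.values is a 2-D array of shape (1, 6). Matplotlib sees one row rather than six values, so its length does not match the six labels in high.columns.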
#attempt 1
plt.pie(high.values, explode=None, labels = high.columns, autopct='%1.1f%%', shadow=True, startangle=140)
This attempt didn't work because list(high.values) returns a list whose only element is an array, so its length is 1 rather than 6.
#attempt 2
labels = list(high.columns)
sizes = list(high.values)
plt.pie(sizes, explode=None, labels = labels, autopct='%1.1f%%', shadow=True, startangle=140)
The last attempt didn't work because high.loc[[0]] returns a dataframe with one row, not a 1-D sequence. Matplotlib does not know how to parse a dataframe as an input.
#attempt 3
labels = list(high.columns)
sizes = high.loc[[0]]
print(labels)
print(sizes)
plt.pie(sizes, explode=None, labels = labels, autopct='%1.1f%%', shadow=True, startangle=140)
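A quick way to see why each attempt fails is to check the shape or length of what gets passed as the sizes. This is a small diagnostic sketch; the names attempt_1, attempt_2, attempt_3, and fixed are only labels for this example.

```python
import pandas as pd

# One-row frame rebuilt from the values shown in the question.
high = pd.DataFrame(
    {"white": [10239], "black": [26907], "asian": [1079],
     "native": [670], "NH_PI": [80], "latin": [1101]}
)

labels = list(high.columns)

attempt_1 = high.values          # 2-D array: one row, six columns
attempt_2 = list(high.values)    # list holding that single row array
attempt_3 = high.loc[[0]]        # DataFrame with one row
fixed = high.squeeze().values    # 1-D array, one entry per label

print(len(labels))        # number of labels
print(attempt_1.shape)    # shape of the raw values
print(len(attempt_2))     # length of the list version
print(attempt_3.shape)    # shape of the .loc[[0]] result
print(fixed.shape)        # shape after squeezing to one dimension
```

Only fixed has the same length as labels, which is what the "'label' must be of length 'x'" check is asking for.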
Upvotes: 2
Reputation: 153510
You can try this, using pandas dataframe plot:
df.T.plot.pie(y=0, autopct='%1.1f%%', shadow=True, startangle=140, figsize=(10,8), legend=False)
Output:
Upvotes: 1