Reputation: 3
What is the best way to produce pie charts for unique values in a Dataframe?
I have a DataFrame that shows number of services by county. I would like to produce a group of pie charts for every county that shows the count of services in that county. I've tried a variety of different approaches without much success.
Here's my data
print(mdfgroup)
      County     Service  ServiceCt
0   Alamance    Literacy          1
1   Alamance   Technical          1
2   Alamance  Vocational          4
3    Chatham    Literacy          3
4    Chatham   Technical          2
5    Chatham  Vocational          1
6     Durham    Literacy          1
7     Durham   Technical          1
8     Durham  Vocational          1
9     Orange    Literacy          1
10      Wake    Literacy          2
11      Wake   Technical          2
So there would be one chart for Alamance with slices for literacy, technical, vocational; a chart for Chatham, Durham, etc. Slice size would be based on ServiceCt.
I've been experimenting with a lot of different approaches, but I'm not sure which would be the most efficient. I tried the code below, but it doesn't break the data down by county and it doesn't produce any graphs.
for i, row in enumerate(mdfgroup.itertuples(),1):
    plt.figure()
    plt.pie(row.ServiceCt,labels=row.Service,
            startangle=90,frame=True, explode=0.2,radius=3)
plt.show()
This throws an error:
TypeError: len() of unsized object
and then produces a blank plot box
(I can't embed an image yet so here's the link) Blank Plot Box
Ideally I'd like them to all be subplots, but at this stage I'd take a series of individual plots. The other examples I've found don't deal with unique values for a key (County).
Upvotes: 0
Views: 6650
Reputation: 339280
A common approach is to iterate over the groupby of a column. Here the column to iterate over is "Country" (the example data below uses that name for the question's "County" column). You may first create a subplot grid with at least as many subplots as there are unique values in that column. Then you may iterate over the subplots and the groups simultaneously.
At the end there might be some empty subplot(s); those can be set invisible.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"Country" : list("AAACCCDDDOWW"),
                   "Service" : list("LTV")*4,
                   "ServiceCt" : list(map(int, "114321111122"))})

cols = 3
g = df.groupby("Country")
# enough rows to hold one subplot per group
rows = int(np.ceil(len(g)/cols))

fig, axes = plt.subplots(ncols=cols, nrows=rows)

# pair each (group name, sub-DataFrame) with one axes
for (c, grp), ax in zip(g, axes.flat):
    ax.pie(grp.ServiceCt, labels=grp.Service)
    ax.set_title(c)

# hide any leftover empty subplots
if len(g) < cols*rows:
    for ax in axes.flatten()[len(g):]:
        ax.axis("off")

plt.show()
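To see what the loop above iterates over, here is a minimal sketch that only inspects the groups, without plotting. It assumes the question's data, retyped by hand into a DataFrame named mdfgroup with the original "County" column; the dictionary name groups is just illustrative.

```python
import pandas as pd

# Data retyped from the question (County, Service, ServiceCt).
mdfgroup = pd.DataFrame({
    "County": ["Alamance"] * 3 + ["Chatham"] * 3 + ["Durham"] * 3
              + ["Orange"] + ["Wake"] * 2,
    "Service": ["Literacy", "Technical", "Vocational"] * 3
               + ["Literacy", "Literacy", "Technical"],
    "ServiceCt": [1, 1, 4, 3, 2, 1, 1, 1, 1, 1, 2, 2],
})

# Iterating over a groupby yields (group key, sub-DataFrame) pairs;
# each sub-DataFrame holds only the rows of one county.
groups = {county: grp for county, grp in mdfgroup.groupby("County")}

for county, grp in groups.items():
    # These are the values and labels that one pie chart would receive.
    print(county, grp["Service"].tolist(), grp["ServiceCt"].tolist())
```

Each sub-DataFrame carries exactly the slice sizes and labels for one pie, which is why a single ax.pie call per group is enough.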
This case is actually well suited to seaborn's FacetGrid.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame({"Country" : list("AAACCCDDDOWW"),
"Service" : list("LTV")*4,
"ServiceCt" : list(map(int, "114321111122"))})
def pie(v, l, color=None):
    # FacetGrid.map passes a color keyword; accept it and ignore it
    plt.pie(v, labels=l.values)

g = sns.FacetGrid(df, col="Country")
g.map(pie, "ServiceCt", "Service" )
plt.show()
Finally, one can do it all in one line using pandas.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({"Country" : list("AAACCCDDDOWW"),
"Service" : list("LTV")*4,
"ServiceCt" : list(map(int, "114321111122"))})
# pivot takes keyword-only arguments since pandas 2.0
df.pivot(index="Service", columns="Country", values="ServiceCt").plot.pie(subplots=True, legend=False)
plt.show()
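The one-liner works because the pivot turns each group into its own column, and plot.pie(subplots=True) draws one pie per column. Below is a small sketch of just the reshaping step, assuming the question's data retyped as mdfgroup with its "County" column; the names wide and filled are illustrative. Filling the missing cells with 0 is an explicit choice made here so that absent services become zero-sized slices.

```python
import pandas as pd

# Data retyped from the question (County, Service, ServiceCt).
mdfgroup = pd.DataFrame({
    "County": ["Alamance"] * 3 + ["Chatham"] * 3 + ["Durham"] * 3
              + ["Orange"] + ["Wake"] * 2,
    "Service": ["Literacy", "Technical", "Vocational"] * 3
               + ["Literacy", "Literacy", "Technical"],
    "ServiceCt": [1, 1, 4, 3, 2, 1, 1, 1, 1, 1, 2, 2],
})

# One row per service, one column per county.
wide = mdfgroup.pivot(index="Service", columns="County", values="ServiceCt")

# Counties without a given service (e.g. Orange has only Literacy)
# end up with NaN in that cell; replace those with 0 explicitly.
filled = wide.fillna(0)

print(filled)
```

From here, filled.plot.pie(subplots=True, legend=False) should give one pie per county column.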
Upvotes: 5
Reputation: 40697
Is this what you had in mind?
import matplotlib.pyplot as plt

# mdfgroup is the DataFrame from the question
Ncounties = len(mdfgroup.County.unique())
fig, axs = plt.subplots(1, Ncounties, figsize=(3*Ncounties,3), subplot_kw={'aspect':'equal'})
# one axes per county, in groupby order
for ax,(groupname,subdf) in zip(axs,mdfgroup.groupby('County')):
    ax.pie(subdf.ServiceCt, labels=subdf.Service)
    ax.set_title(groupname)
plt.show()
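One caveat with zip(axs, ...): if the DataFrame happens to contain a single county, plt.subplots(1, 1) returns a bare Axes object rather than an array, and the zip fails. A possible workaround, sketched below, is to pass squeeze=False so that an array of Axes is always returned. The helper name plot_county_pies and the one-row test frame are made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_county_pies(df):
    """Draw one pie per county in a single row (hypothetical helper)."""
    groups = df.groupby("County")
    n = groups.ngroups
    # squeeze=False always returns a 2D array of Axes, even when n == 1.
    fig, axs = plt.subplots(1, n, figsize=(3 * n, 3), squeeze=False,
                            subplot_kw={"aspect": "equal"})
    for ax, (name, sub) in zip(axs.flat, groups):
        ax.pie(sub["ServiceCt"], labels=sub["Service"])
        ax.set_title(name)
    return fig, axs


# A frame with only one county, the case where plain zip(axs, ...) breaks.
single = pd.DataFrame({"County": ["Orange"],
                       "Service": ["Literacy"],
                       "ServiceCt": [1]})
fig, axs = plot_county_pies(single)
```

Calling plt.show() afterwards displays the figure; the same function can be called with the full mdfgroup.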
Upvotes: 2