Reputation: 21981
I have the following data:
    Year  LandUse           Region  Area
0   2005  Corn              LP      2078875
1   2005  Corn              UP      149102.4
2   2005  Open Lands        LP      271715
3   2005  Open Lands        UP      232290.1
4   2005  Soybeans          LP      1791342
5   2005  Soybeans          UP      50799.12
6   2005  Other Ag          LP      638010.4
7   2005  Other Ag          UP      125527.2
8   2005  Forests/Wetlands  LP      69629.86
9   2005  Forests/Wetlands  UP      26511.43
10  2005  Developed         LP      10225.56
11  2005  Developed         UP      1248.442
12  2010  Corn              LP      2303999
13  2010  Corn              UP      201977.2
14  2010  Open Lands        LP      131696.3
15  2010  Open Lands        UP      45845.81
16  2010  Soybeans          LP      1811186
17  2010  Soybeans          UP      66271.21
18  2010  Other Ag          LP      635332.9
19  2010  Other Ag          UP      257439.9
20  2010  Forests/Wetlands  LP      48124.43
21  2010  Forests/Wetlands  UP      23433.76
22  2010  Developed         LP      7619.853
23  2010  Developed         UP      707.4816
How do I use pandas to make a stacked bar plot that shows Area on the y-axis, uses Region to construct the stacks, and puts Year and LandUse on the x-axis?
Upvotes: 0
Views: 73
Reputation: 60130
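The main thing with pandas plots is figuring out which shape pandas expects the data to be in. For a stacked bar plot, each index entry becomes a position on the x-axis and each column becomes a layer in the stack, so reshape the data so that Year is in the index and each region has its own column: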
# Assuming we want to sum the areas across the different
# LandUse values within each region
plot_table = df.pivot_table(index='Year', columns='Region',
                            values='Area', aggfunc='sum')
plot_table
Out[39]:
Region           LP           UP
Year
2005    4859797.820  585478.6920
2010    4937958.483  595675.3616
Plotting is then straightforward:
plot_table.plot(kind='bar', stacked=True)
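For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's data and draws the Year-level plot. The way `df` is constructed (via `MultiIndex.from_product`), the `Agg` backend, and the output file name `area_by_year.png` are my own assumptions for illustration, not something from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Rebuild the question's table: Year (outer), LandUse, Region (inner)
land_uses = ["Corn", "Open Lands", "Soybeans", "Other Ag",
             "Forests/Wetlands", "Developed"]
index = pd.MultiIndex.from_product(
    [[2005, 2010], land_uses, ["LP", "UP"]],
    names=["Year", "LandUse", "Region"],
)
df = index.to_frame(index=False)
df["Area"] = [
    2078875, 149102.4, 271715, 232290.1, 1791342, 50799.12,
    638010.4, 125527.2, 69629.86, 26511.43, 10225.56, 1248.442,
    2303999, 201977.2, 131696.3, 45845.81, 1811186, 66271.21,
    635332.9, 257439.9, 48124.43, 23433.76, 7619.853, 707.4816,
]

# One row per Year, one column per Region, areas summed over LandUse
plot_table = df.pivot_table(index="Year", columns="Region",
                            values="Area", aggfunc="sum")

# Each row becomes one bar; each column becomes one layer of the stack
ax = plot_table.plot(kind="bar", stacked=True)
ax.set_ylabel("Area")
plt.tight_layout()
plt.savefig("area_by_year.png")  # hypothetical output file name
```

In an interactive session you can drop the `matplotlib.use("Agg")` line and call `plt.show()` instead of saving to a file.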
Having both Year and LandUse on the x-axis doesn't require much extra work: put both in the index when creating the table for plotting, then plot it exactly as before:
plot_table = df.pivot_table(index=['Year', 'LandUse'],
                            columns='Region',
                            values='Area', aggfunc='sum')
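A self-contained sketch of the two-level version, in case it helps. With a `['Year', 'LandUse']` index, pandas labels each bar with the index tuple, so this sketch also replaces the tick labels with a plainer "Year LandUse" string. The construction of `df`, the `Agg` backend, the tick-label formatting, and the file name `area_by_year_landuse.png` are assumptions of mine, not part of the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Rebuild the question's table: Year (outer), LandUse, Region (inner)
land_uses = ["Corn", "Open Lands", "Soybeans", "Other Ag",
             "Forests/Wetlands", "Developed"]
index = pd.MultiIndex.from_product(
    [[2005, 2010], land_uses, ["LP", "UP"]],
    names=["Year", "LandUse", "Region"],
)
df = index.to_frame(index=False)
df["Area"] = [
    2078875, 149102.4, 271715, 232290.1, 1791342, 50799.12,
    638010.4, 125527.2, 69629.86, 26511.43, 10225.56, 1248.442,
    2303999, 201977.2, 131696.3, 45845.81, 1811186, 66271.21,
    635332.9, 257439.9, 48124.43, 23433.76, 7619.853, 707.4816,
]

# One row per (Year, LandUse) pair, one column per Region
plot_table = df.pivot_table(index=["Year", "LandUse"], columns="Region",
                            values="Area", aggfunc="sum")

# One bar per (Year, LandUse) pair, stacked by Region
ax = plot_table.plot(kind="bar", stacked=True, figsize=(10, 5))
ax.set_ylabel("Area")

# Optional: replace the default tuple labels with "Year LandUse"
labels = [f"{year} {land_use}" for year, land_use in plot_table.index]
ax.set_xticklabels(labels, rotation=45, ha="right")
ax.set_xlabel("Year, LandUse")

plt.tight_layout()
plt.savefig("area_by_year_landuse.png")  # hypothetical output file name
```

Note that `pivot_table` sorts the index, so within each year the bars appear in alphabetical LandUse order rather than the order of the original rows.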
Upvotes: 1