Reputation: 781
I'm trying to make an array of bar charts, one per city. Each chart should show the counts on the y-axis (ranging from 70 to 210) and 21 bars on the x-axis, one for each combination of weekday and time slot (7 x 3 = 21). This is the data:
import pandas as pd
import matplotlib.pyplot as plt
data = [
['CITY','DAY','TIME_BIN', 'COUNT'],
['PHOENIX', "Friday", 1, 70],
['PHOENIX', "Thursday", 2, 80],
['PHOENIX', "Wednesday", 3, 90],
['ATLANTA', "Sunday", 1, 130],
['ATLANTA', "Monday", 2, 150],
['ATLANTA', "Tuesday", 3, 160],
['CHICAGO', "Saturday", 1, 180],
['CHICAGO', "Friday", 2, 200],
['CHICAGO', "Friday", 3, 210],
]
df = pd.DataFrame(data[1:],columns=data[0])
print(df)
CITY DAY TIME_BIN COUNT
0 PHOENIX Friday 1 70
1 PHOENIX Thursday 2 80
2 PHOENIX Wednesday 3 90
3 ATLANTA Sunday 1 130
4 ATLANTA Monday 2 150
5 ATLANTA Tuesday 3 160
6 CHICAGO Saturday 1 180
7 CHICAGO Friday 2 200
8 CHICAGO Friday 3 210
I want the output to combine the two attempts below: the array-of-charts layout from the first, but with the bar charts from the second.
# Successful attempt at making an array of charts but wrong type
df[['DAY', 'TIME_BIN']].hist(by=df['CITY'])
plt.show()
# Bar chart with proper counts but x-axis did not combine properly
ax = df.plot(x=['DAY', 'TIME_BIN'],
y='COUNT',
kind='bar',
color=["g","b"])
plt.show()
Upvotes: 1
Views: 4870
Reputation: 339062
An easy way to plot such categorical data with an additional parameter is to use seaborn.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
data = [
['CITY','DAY','TIME_BIN', 'COUNT'],
['PHOENIX', "Friday", 1, 70],
['PHOENIX', "Thursday", 2, 80],
['PHOENIX', "Wednesday", 3, 90],
['ATLANTA', "Sunday", 1, 130],
['ATLANTA', "Monday", 2, 150],
['ATLANTA', "Tuesday", 3, 160],
['CHICAGO', "Saturday", 1, 180],
['CHICAGO', "Friday", 2, 200],
['CHICAGO', "Friday", 3, 210],
]
df = pd.DataFrame(data[1:],columns=data[0])
# factorplot was renamed to catplot (and size to height) in seaborn 0.9
g = sns.catplot(x="DAY", y="COUNT", hue="TIME_BIN", col="CITY", col_wrap=3,
                data=df,
                kind="bar", height=3, aspect=.8)
g.set_xticklabels(rotation=30, ha="right")
plt.tight_layout()
plt.show()
With pandas, you can pass subplots=True when plotting a dataframe with several columns; this gives one subplot per column. To get there, first create a MultiIndex from the "DAY" and "TIME_BIN" columns, then pivot on the "CITY" column.
import pandas as pd
import matplotlib.pyplot as plt
data = [
['CITY','DAY','TIME_BIN', 'COUNT'],
['PHOENIX', "Friday", 1, 70],
['PHOENIX', "Thursday", 2, 80],
['PHOENIX', "Wednesday", 3, 90],
['ATLANTA', "Sunday", 1, 130],
['ATLANTA', "Monday", 2, 150],
['ATLANTA', "Tuesday", 3, 160],
['CHICAGO', "Saturday", 1, 180],
['CHICAGO', "Friday", 2, 200],
['CHICAGO', "Friday", 3, 210],
]
df = pd.DataFrame(data[1:],columns=data[0])
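# Use (DAY, TIME_BIN) as the row index so each combination becomes one bar
df.set_index(['DAY','TIME_BIN'], inplace=True)
# One column per city after the pivot; plot() returns one Axes per subplot
axes = df.pivot(columns="CITY").plot(kind="bar", subplots=True)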
plt.tight_layout()
plt.show()
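The pivot above only produces bars for the (DAY, TIME_BIN) pairs that actually occur in the data. If all 21 slots should appear in every chart, one option is to reindex the pivoted table against the full product of weekdays and time bins. The following is a minimal sketch, not a definitive solution: the names DAYS, TIME_BINS and city_table are made up for illustration, and it assumes that the time bins are exactly 1, 2 and 3 and that a missing combination should be drawn as a count of 0.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Assumed full set of categories (illustrative names, not from the question)
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
TIME_BINS = [1, 2, 3]


def city_table(df):
    """Return a 21-row (DAY, TIME_BIN) table with one COUNT column per city."""
    table = df.pivot_table(index=["DAY", "TIME_BIN"], columns="CITY",
                           values="COUNT", aggfunc="sum")
    full_index = pd.MultiIndex.from_product([DAYS, TIME_BINS],
                                            names=["DAY", "TIME_BIN"])
    # Add the missing combinations and treat them as zero counts (assumption)
    return table.reindex(full_index).fillna(0)


rows = [
    ["PHOENIX", "Friday", 1, 70],
    ["PHOENIX", "Thursday", 2, 80],
    ["PHOENIX", "Wednesday", 3, 90],
    ["ATLANTA", "Sunday", 1, 130],
    ["ATLANTA", "Monday", 2, 150],
    ["ATLANTA", "Tuesday", 3, 160],
    ["CHICAGO", "Saturday", 1, 180],
    ["CHICAGO", "Friday", 2, 200],
    ["CHICAGO", "Friday", 3, 210],
]
df = pd.DataFrame(rows, columns=["CITY", "DAY", "TIME_BIN", "COUNT"])

table = city_table(df)
# One subplot per city, each with the same 21 bars on a shared x-axis
axes = table.plot(kind="bar", subplots=True, sharex=True,
                  figsize=(8, 8), legend=False)
plt.tight_layout()
```

Call plt.show() afterwards to display the figure. Building the index from an explicit weekday list also keeps the bars in calendar order rather than alphabetical order.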
Upvotes: 2