Calculus

Reputation: 781

Pandas multiple bar charts with 2 columns on X-axis

I'm trying to make an array of bar charts, one chart per city. Each chart should show the counts on the Y-axis (ranging from 70 to 210) and 21 bars on the X-axis, one for each weekday and time-slot combination (7 x 3 = 21). This is the data:

import pandas as pd
import matplotlib.pyplot as plt

data = [
    ['CITY','DAY','TIME_BIN', 'COUNT'],
    ['PHOENIX', "Friday", 1, 70],
    ['PHOENIX', "Thursday", 2, 80],
    ['PHOENIX', "Wednesday", 3, 90],
    ['ATLANTA', "Sunday", 1, 130],
    ['ATLANTA', "Monday", 2, 150],
    ['ATLANTA', "Tuesday", 3, 160],
    ['CHICAGO', "Saturday", 1, 180],
    ['CHICAGO', "Friday", 2, 200],
    ['CHICAGO', "Friday", 3, 210],
]
df = pd.DataFrame(data[1:],columns=data[0])
print(df)

      CITY        DAY  TIME_BIN  COUNT
0  PHOENIX     Friday         1     70
1  PHOENIX   Thursday         2     80
2  PHOENIX  Wednesday         3     90
3  ATLANTA     Sunday         1    130
4  ATLANTA     Monday         2    150
5  ATLANTA    Tuesday         3    160
6  CHICAGO   Saturday         1    180
7  CHICAGO     Friday         2    200
8  CHICAGO     Friday         3    210

I want the output to combine the two attempts below: the array-of-charts layout from the first, but drawn as bar charts like the second.

# Successful attempt at making an array of charts but wrong type 
df[['DAY', 'TIME_BIN']].hist(by=df['CITY'])
plt.show()

# Bar chart with proper counts but x-axis did not combine properly
ax = df.plot(x=['DAY', 'TIME_BIN'],
             y='COUNT',
             kind='bar',
             color=["g", "b"])
plt.show()

Upvotes: 1

Views: 4870

Answers (1)

ImportanceOfBeingErnest

Reputation: 339062

An easy way to plot such categorical data with an additional parameter is to use seaborn.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

data = [
    ['CITY','DAY','TIME_BIN', 'COUNT'],
    ['PHOENIX', "Friday", 1, 70],
    ['PHOENIX', "Thursday", 2, 80],
    ['PHOENIX', "Wednesday", 3, 90],
    ['ATLANTA', "Sunday", 1, 130],
    ['ATLANTA', "Monday", 2, 150],
    ['ATLANTA', "Tuesday", 3, 160],
    ['CHICAGO', "Saturday", 1, 180],
    ['CHICAGO', "Friday", 2, 200],
    ['CHICAGO', "Friday", 3, 210],
]
df = pd.DataFrame(data[1:],columns=data[0])

# catplot is the current name of the older factorplot; its "size" argument is now "height"
g = sns.catplot(x="DAY", y="COUNT", hue="TIME_BIN", col="CITY", col_wrap=3,
                data=df,
                kind="bar", height=3, aspect=.8)
g.set_xticklabels(rotation=30, ha="right")
plt.tight_layout()
plt.show()

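With plain string values, the days appear on the x-axis in the order they occur in the data rather than in calendar order. One possible way around this, sketched below, is to convert "DAY" to an ordered categorical before plotting; seaborn uses the category order for columns with a categorical dtype. The DAYS list and the Monday-first ordering are assumptions for illustration, not part of the original data.

```python
import pandas as pd

# Assumed calendar order for the x-axis (Monday first); adjust as needed
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

data = [
    ['CITY', 'DAY', 'TIME_BIN', 'COUNT'],
    ['PHOENIX', "Friday", 1, 70],
    ['PHOENIX', "Thursday", 2, 80],
    ['PHOENIX', "Wednesday", 3, 90],
    ['ATLANTA', "Sunday", 1, 130],
    ['ATLANTA', "Monday", 2, 150],
    ['ATLANTA', "Tuesday", 3, 160],
    ['CHICAGO', "Saturday", 1, 180],
    ['CHICAGO', "Friday", 2, 200],
    ['CHICAGO', "Friday", 3, 210],
]
df = pd.DataFrame(data[1:], columns=data[0])

# Turn DAY into an ordered categorical so sorting and plotting follow DAYS
df["DAY"] = pd.Categorical(df["DAY"], categories=DAYS, ordered=True)

# Sorting now follows calendar order instead of alphabetical order
df = df.sort_values(["DAY", "TIME_BIN"]).reset_index(drop=True)
print(df)
```

Passing this frame to the seaborn call should then give the same day order in every city panel, including days for which a city has no rows.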


With pandas alone, you can pass subplots=True when plotting a dataframe with several columns; this gives one subplot per column. To get there, first create a MultiIndex from the "DAY" and "TIME_BIN" columns and then pivot on the "CITY" column, so that each city becomes its own column.

import pandas as pd
import matplotlib.pyplot as plt

data = [
    ['CITY','DAY','TIME_BIN', 'COUNT'],
    ['PHOENIX', "Friday", 1, 70],
    ['PHOENIX', "Thursday", 2, 80],
    ['PHOENIX', "Wednesday", 3, 90],
    ['ATLANTA', "Sunday", 1, 130],
    ['ATLANTA', "Monday", 2, 150],
    ['ATLANTA', "Tuesday", 3, 160],
    ['CHICAGO', "Saturday", 1, 180],
    ['CHICAGO', "Friday", 2, 200],
    ['CHICAGO', "Friday", 3, 210],
]
df = pd.DataFrame(data[1:],columns=data[0])
df.set_index(['DAY','TIME_BIN'], inplace=True)

# Pivot so that each city becomes its own column, then draw one subplot per column
piv = df.pivot(columns="CITY")
piv.plot(kind="bar", subplots=True)
plt.tight_layout()
plt.show()

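The pivot above only contains the (DAY, TIME_BIN) pairs that occur in the data, so each chart shows fewer than the 21 bars asked for in the question. A minimal sketch of one way to get all 7 x 3 slots is to reindex the pivoted frame against the full product of days and time bins, filling missing slots with 0. The DAYS and TIME_BINS lists, the full_grid helper, and the choice of 0 as the fill value are assumptions for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Assumed full set of slots: 7 weekdays x 3 time bins = 21 bars per city
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]
TIME_BINS = [1, 2, 3]

def full_grid(df):
    """Return one row per (DAY, TIME_BIN) slot and one column per city."""
    # pivot_table sums COUNT if the same city/day/bin appears more than once
    piv = df.pivot_table(index=["DAY", "TIME_BIN"], columns="CITY",
                         values="COUNT", aggfunc="sum")
    all_slots = pd.MultiIndex.from_product([DAYS, TIME_BINS],
                                           names=["DAY", "TIME_BIN"])
    # Add the missing slots in calendar order; empty slots become 0
    return piv.reindex(all_slots).fillna(0)

data = [
    ['CITY', 'DAY', 'TIME_BIN', 'COUNT'],
    ['PHOENIX', "Friday", 1, 70],
    ['PHOENIX', "Thursday", 2, 80],
    ['PHOENIX', "Wednesday", 3, 90],
    ['ATLANTA', "Sunday", 1, 130],
    ['ATLANTA', "Monday", 2, 150],
    ['ATLANTA', "Tuesday", 3, 160],
    ['CHICAGO', "Saturday", 1, 180],
    ['CHICAGO', "Friday", 2, 200],
    ['CHICAGO', "Friday", 3, 210],
]
df = pd.DataFrame(data[1:], columns=data[0])

piv = full_grid(df)

# One bar chart per city, side by side, sharing the Y-axis
piv.plot(kind="bar", subplots=True, layout=(1, 3), sharey=True,
         figsize=(12, 4), legend=False)
plt.tight_layout()
plt.show()
```

Whether 0 is the right value for an empty slot depends on the data; leaving the gaps as NaN (dropping the fillna call) draws no bar there instead.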

Upvotes: 2
