Reputation: 65
I want to create a bar chart with a focus on two cities. My data set is similar to this.
city rate Bedrooms
Houston 132.768382 0
Dallas 151.981043 1
Dallas 112.897727 3
Houston 132.332665 1
Houston 232.611185 2
Dallas 93.530662 4
I've split the data into two dataframes, one for Dallas and one for Houston, and plotted each like this:
dal.groupby('bedrooms')['rate'].mean().plot(kind='bar')
and
hou.groupby('bedrooms')['rate'].mean().plot(kind='bar')
How would I go about making a single bar chart that shows the average listing rate for each number of bedrooms, with the two cities side by side? Something similar to the image below, which I found in "Python matplotlib multiple bars", with the cities as the labels.
I'd appreciate any help!
Upvotes: 1
Views: 2000
Reputation: 13447
There is an easy solution using a single line of pandas (as long as you rearrange the data first), or alternatively using plotly.
import pandas as pd
df = pd.DataFrame({'city': {0: 'Houston',
1: 'Dallas',
2: 'Dallas',
3: 'Houston',
4: 'Houston',
5: 'Dallas'},
'rate': {0: 132.768382,
1: 151.981043,
2: 112.897727,
3: 132.332665,
4: 232.611185,
5: 93.530662},
'Bedrooms': {0: 0, 1: 1, 2: 3, 3: 1, 4: 2, 5: 4}})
# groupby
df = df.groupby(["city", "Bedrooms"])["rate"].mean().reset_index()
With pivot_table we can rearrange the data so that each city becomes its own column:
pv = pd.pivot_table(df,
index="Bedrooms",
columns="city",
values="rate")
city Dallas Houston
Bedrooms
0 NaN 132.768382
1 151.981043 132.332665
2 NaN 232.611185
3 112.897727 NaN
4 93.530662 NaN
Then plot it in a single line:
pv.plot(kind="bar");
Alternatively, with plotly (using the grouped df from above, not the pivoted table):
import plotly.express as px
px.bar(df, x="Bedrooms", y="rate", color="city",barmode='group')
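As a variation on the pandas route above, the groupby and pivot steps can be collapsed into one expression with unstack. This is only a sketch using the sample rows from the question; the axis label text is my own choice.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    "city": ["Houston", "Dallas", "Dallas", "Houston", "Houston", "Dallas"],
    "rate": [132.768382, 151.981043, 112.897727, 132.332665, 232.611185, 93.530662],
    "Bedrooms": [0, 1, 3, 1, 2, 4],
})

# Mean rate per (Bedrooms, city), then move the city level into the columns
pv = df.groupby(["Bedrooms", "city"])["rate"].mean().unstack("city")

# One group of bars per bedroom count, one bar per city;
# combinations with no listings are NaN and simply get no bar
ax = pv.plot(kind="bar", rot=0)
ax.set_ylabel("mean rate")
```

The resulting pv has the same shape as the pivot_table output shown above, so either form can be passed to plot.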
Upvotes: 1
Reputation: 4021
Seaborn is your friend in this case. First create a grouped dataframe with the average rate per city and number of bedrooms, then plot it with seaborn:
import seaborn as sns
# df is the full dataframe containing both cities, so that hue='city' has two groups
grouped = df.groupby(['city', 'Bedrooms']).agg({'rate': 'mean'}).reset_index()
sns.barplot(data=grouped, x='Bedrooms', y='rate', hue='city')
With the data above, this produces the following plot:
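As a side note, seaborn's barplot aggregates by itself (its default estimator is the mean), so the explicit groupby may not be needed. A minimal sketch on the sample rows from the question, assuming a reasonably recent seaborn; the y-axis label is my own addition:

```python
import pandas as pd
import seaborn as sns

# Sample rows copied from the question
df = pd.DataFrame({
    "city": ["Houston", "Dallas", "Dallas", "Houston", "Houston", "Dallas"],
    "rate": [132.768382, 151.981043, 112.897727, 132.332665, 232.611185, 93.530662],
    "Bedrooms": [0, 1, 3, 1, 2, 4],
})

# barplot averages y within each (x, hue) group before drawing the bars
ax = sns.barplot(data=df, x="Bedrooms", y="rate", hue="city")
ax.set_ylabel("mean rate")
```

With more than one listing per group, seaborn also draws error bars on top of the means, which the pre-aggregated version does not.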
Upvotes: 3
Reputation: 10545
Here's a basic way to do it in matplotlib:
import numpy as np
import matplotlib.pyplot as plt
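x = np.arange(5) # bedroom counts 0-4, i.e. if the max. number of bedrooms is 4
# reindex so each city has a value for every bedroom count (0 where it has no listings)
data_dallas = dal.groupby('bedrooms')['rate'].mean().reindex(x, fill_value=0)
data_houston = hou.groupby('bedrooms')['rate'].mean().reindex(x, fill_value=0)
fig, ax = plt.subplots()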
width = 0.35 # width of one bar
dal_bars = ax.bar(x, data_dallas, width)
hou_bars = ax.bar(x + width, data_houston, width)
ax.set_xticks(x + width / 2)
ax.set_xticklabels(x)
ax.legend((dal_bars[0], hou_bars[0]), ('Dallas', 'Houston'))
plt.show()
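The same manual approach can be extended to any number of cities by looping over the columns of a pivoted table. This is a sketch under my own assumptions: the sample rows from the question, missing combinations drawn as zero-height bars, and 80% of each slot shared between the cities.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Sample rows copied from the question
df = pd.DataFrame({
    "city": ["Houston", "Dallas", "Dallas", "Houston", "Houston", "Dallas"],
    "rate": [132.768382, 151.981043, 112.897727, 132.332665, 232.611185, 93.530662],
    "Bedrooms": [0, 1, 3, 1, 2, 4],
})

# One row per bedroom count, one column per city; missing combinations become 0
means = df.pivot_table(index="Bedrooms", columns="city",
                       values="rate", aggfunc="mean").fillna(0)

x = np.arange(len(means.index))
n_cities = len(means.columns)
width = 0.8 / n_cities  # share 80% of each slot between the cities

fig, ax = plt.subplots()
for i, city in enumerate(means.columns):
    # shift each city's bars by one bar width
    ax.bar(x + i * width, means[city], width, label=city)

# put the tick in the middle of each group of bars
ax.set_xticks(x + width * (n_cities - 1) / 2)
ax.set_xticklabels(means.index)
ax.set_xlabel("Bedrooms")
ax.set_ylabel("Mean rate")
ax.legend()
```

Call plt.show() afterwards to display the figure when running this as a script.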
Upvotes: 1
Reputation: 442
You can read more here: https://pythonspot.com/matplotlib-bar-chart/
import numpy as np
import matplotlib.pyplot as plt
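# data to plot: dal and hou are the per-city dataframes from the question
bedrooms = [0, 1, 2, 3] # bedroom counts to show; adjust to your dataset
n_groups = len(bedrooms)
# average rate per bedroom count, 0 where a city has no listings of that size
mean_rates_houston = hou.groupby('bedrooms')['rate'].mean().reindex(bedrooms, fill_value=0)
mean_rates_dallas = dal.groupby('bedrooms')['rate'].mean().reindex(bedrooms, fill_value=0)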
# create plot
fig, ax = plt.subplots()
index = np.arange(n_groups)
bar_width = 0.35
opacity = 0.8
rects1 = plt.bar(index, mean_rates_dallas, bar_width,
alpha=opacity,
color='b',
label='Dallas')
rects2 = plt.bar(index + bar_width, mean_rates_houston, bar_width,
alpha=opacity,
color='g',
label='Houston')
plt.xlabel('Bedrooms')
plt.ylabel('Rates')
plt.title('Bedroom Rates per City')
# change the tick labels to match the bedroom counts in your dataset;
# offsetting by half a bar width centers each tick under its pair of bars
plt.xticks(index + bar_width / 2, ('0', '1', '2', '3'))
plt.legend()
plt.tight_layout()
plt.show()
Upvotes: 0