Chris Worth

Reputation: 49

How to change a seaborn histogram plot to work for hours of the day?

I have a pandas dataframe with many time intervals of varying start times and lengths. I am interested in the distribution of start times over 24 hours, so I have added a column named Hour that holds the hour of each start time. I plotted a histogram with seaborn to look at the distribution, but the x axis starts at 0 and runs to 24. Is there a way to make it run from 8 to 8 instead, wrapping around from 23 to 0, so that it gives a better visualisation of my data from a time-of-day perspective? Thanks in advance.

sns.distplot(df2['Hour'], bins = 24, kde = False).set(xlim=(0,23))

Upvotes: 0

Views: 2692

Answers (1)

gherka

Reputation: 1446

If you want a custom order of x-values on your bar plot, I'd suggest using matplotlib directly and plotting your histogram as a bar plot with width=1 to get rid of the padding between bars.

import pandas as pd
import numpy as np
from datetime import datetime
import matplotlib.pyplot as plt

# prepare sample data
dates = pd.date_range(
    start=datetime(2020, 1, 1),
    end=datetime(2020, 1, 7),
    freq="H")

random_dates = np.random.choice(dates, 1000)

df = pd.DataFrame(data={"date":random_dates})

df["hour"] = df["date"].dt.hour

# set your preferred order of hours
hour_order = [8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,0,1,2,3,4,5,6,7]

# calculate frequencies of each hour and put them in the preferred order
# (reindex fills hours that never occur with 0 instead of raising a KeyError)
plot_df = (
    df["hour"]
    .value_counts()
    .reindex(hour_order, fill_value=0)
    .rename_axis("hour")
    .reset_index(name="freq"))

# day / night colour split
day_mask = ((8 <= plot_df["hour"]) & (plot_df["hour"] <= 20))
plot_df["color"] = np.where(day_mask, "skyblue", "midnightblue")

# actual plotting - hours are cast to strings so that matplotlib treats them
# as categories and keeps the custom order instead of sorting them numerically
fig = plt.figure(figsize=(8,4))
ax = fig.add_subplot(111)

ax.bar(
    x=plot_df["hour"].astype(str),
    height=plot_df["freq"],
    color=plot_df["color"], width=1)

ax.set_xlabel('Hour')
ax.set_ylabel('Frequency')

plt.show()
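
If you'd rather not build the intermediate dataframe, the same reordering can be wrapped in a small helper. This is only a minimal sketch, assuming your column holds integer hours from 0 to 23 with no missing values; the names `wrapped_hour_counts` and `plot_wrapped_hours` are made up for illustration and are not part of pandas, seaborn or matplotlib.

```python
from collections import Counter


def wrapped_hour_counts(hours, start=8):
    """Return (labels, counts) for all 24 hours, beginning at `start`.

    Assumes `hours` is an iterable of integer hours in the range 0..23.
    Hours that never occur get a count of 0.
    """
    # e.g. start=8 gives 8, 9, ..., 23, 0, 1, ..., 7
    order = [(start + i) % 24 for i in range(24)]
    counts = Counter(int(h) for h in hours)
    # labels are strings so matplotlib treats the axis as categorical
    return [str(h) for h in order], [counts[h] for h in order]


def plot_wrapped_hours(hours, start=8):
    """Draw the counts as a bar plot with touching bars and return the Axes."""
    # imported here so the counting helper works without matplotlib installed
    import matplotlib.pyplot as plt

    labels, counts = wrapped_hour_counts(hours, start)
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.bar(labels, counts, width=1)
    ax.set_xlabel("Hour")
    ax.set_ylabel("Frequency")
    return ax
```

With the dataframe from the question you would call something like `plot_wrapped_hours(df2["Hour"])` followed by `plt.show()`. Changing `start` moves the point where the axis begins, so `start=0` gives back the usual 0 to 23 order.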

Upvotes: 2
