Leo

Reputation: 1168

Plot datetime data in 24 hour window on x axis

I have a dataframe with datetime data:

I want to plot a histogram of the times at which events start and end, ignoring the date (so only the 24-hour clock).

I have tried to use:

df['Start_time'].dt.time

to get just the time and plot it. However, I am then unable to bin the timestamps (now objects) into 20 bins.
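To make the dtype problem concrete, here is a minimal sketch on a hypothetical two-row sample (the sample values are assumptions for illustration; only the column name comes from the example). It shows that `.dt.time` produces an object column, and that one possible workaround is to express each timestamp as a fractional hour of the day, which is numeric and can therefore be binned:

```python
import pandas as pd

# Hypothetical sample, not the real data; the column name follows the example.
sample = pd.DataFrame(
    {"Start_time": pd.to_datetime(["2013-09-20 13:30", "2013-09-22 06:45"])}
)

# .dt.time returns datetime.time objects, so the resulting dtype is object,
# which histogram functions cannot bin numerically.
as_time = sample["Start_time"].dt.time

# One workaround: fractional hour of the day (a float between 0 and 24).
start_hour = (
    sample["Start_time"].dt.hour
    + sample["Start_time"].dt.minute / 60
    + sample["Start_time"].dt.second / 3600
)
```

Because `start_hour` is a plain float column, the date part no longer matters and every value falls in the range 0 to 24.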

This is my result so far: [image: current result]

This is what I am trying to get: a plot with the 24 hours on the x axis and the binned distribution of Start_time and End_time on the y axis. [image]

Here is the code

from random import randrange
import datetime
import pandas as pd 
import plotly.express as px

# make the EXAMPLE dataset
startDate = datetime.datetime(2013, 9, 20,13,00)
start_lst = [] 
end_lst = []
for i in range(200):

  start_time= startDate + datetime.timedelta(hours=randrange(23), minutes= randrange(60))
  end_time  = start_time + datetime.timedelta(hours=randrange(2,7), minutes= randrange(60))
  startDate = startDate + datetime.timedelta(days=randrange(4))
  start_lst.append(start_time)
  end_lst.append(end_time)

df = pd.DataFrame({'Start_time': start_lst, 
                   'End_time':  end_lst 
                   })
df['Time_diff'] = df['End_time']-df['Start_time']


# start of code
# tried just using a histogram, but since the date changes, it won't plot over 24 hours
fig = px.histogram(df, x=['Start_time', 'End_time'], nbins=20)
fig.show()
# so tried removing the date part and leaving just the time; however, now it won't bin properly
df['Start_time_nodate'] = df['Start_time'].dt.time
df['End_time_nodate']   = df['End_time'].dt.time
   
fig = px.histogram(df, x=['Start_time_nodate', 'End_time_nodate'], nbins=20)
fig.show()

Upvotes: 1

Views: 995

Answers (1)

Laurent

Reputation: 13488

If I understand correctly, with your example dataframe, here is one way to do it with Matplotlib:

from matplotlib import pyplot as plt

# Setup
df["Start_time_nodate"] = df["Start_time"].dt.hour
df["End_time_nodate"] = df["End_time"].dt.hour
fig, ax = plt.subplots(figsize=(8, 4))

# Plot frequencies
ax.plot(df["Start_time_nodate"].value_counts(sort=False).sort_index())
ax.plot(df["End_time_nodate"].value_counts(sort=False).sort_index())

# Style plot
ax.legend(["Start time", "End time"])
ax.set_xticks(ticks=[i for i in range(0, 25)])
ax.set_xticklabels([i for i in range(0, 25)])
plt.xlabel("24 hours")
plt.ylabel("Frequency")
ax.margins(x=0)

In a Jupyter notebook, this code outputs the following image:

[image]
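The plot above counts events per whole hour. If exactly 20 bins over the 24-hour window are needed, as in the question, one possible approach is to bin fractional hours with `numpy.histogram` and explicit bin edges. This is only a sketch: the `hour_of_day` helper and the three-row `sample` frame are hypothetical and stand in for the example dataframe.

```python
import numpy as np
import pandas as pd


def hour_of_day(s: pd.Series) -> pd.Series:
    """Hypothetical helper: convert datetimes to fractional hours (0 to 24)."""
    return s.dt.hour + s.dt.minute / 60 + s.dt.second / 3600


# Hypothetical sample spanning several dates; column names follow the question.
sample = pd.DataFrame(
    {
        "Start_time": pd.to_datetime(
            ["2013-09-20 00:30", "2013-09-21 13:00", "2013-09-25 13:10"]
        ),
        "End_time": pd.to_datetime(
            ["2013-09-20 03:00", "2013-09-21 18:45", "2013-09-25 23:59"]
        ),
    }
)

# 21 edges give 20 equal bins of 1.2 hours each across the full day.
edges = np.linspace(0, 24, 21)

# Count how many start and end times fall into each bin, regardless of date.
start_counts, _ = np.histogram(hour_of_day(sample["Start_time"]), bins=edges)
end_counts, _ = np.histogram(hour_of_day(sample["End_time"]), bins=edges)
```

The counts can then be drawn on the same axes, for example with `ax.bar(edges[:-1], start_counts, width=1.2, align="edge")`, keeping the x axis fixed at 0 to 24.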

Upvotes: 1
