Reputation: 136187
Once in a while I have time data for which I just want to visualize how often events occur. I have a list of datetimes and want to show a plot with the hour of day on the x-axis and the number of events on the y-axis.
In other words, it is a histogram grouped by hour.
I already have one solution, but how do I make sure that all 24 bins exist? (and it could look nicer, too)
#!/usr/bin/env python

"""Create and visualize date with timestamps."""

# core modules
from datetime import datetime
import random

# 3rd party module
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt


def create_data(num_samples, year, month_p=None, day_p=None):
    """
    Create timestamp data.

    Parameters
    ----------
    num_samples : int
    year: int
    month_p : int, optional (default: None)
    day_p : int, optional (default: None)

    Returns
    -------
    data : Pandas.Dataframe object
    """
    data = []
    for _ in range(num_samples):
        if month_p is None:
            month = random.randint(1, 12)
        else:
            month = month_p
        if day_p is None:
            day = random.randint(1, 28)
        else:
            day = day_p
        hour = int(np.random.normal(loc=7) * 3) % 24
        minute = random.randint(0, 59)
        data.append({'date': datetime(year, month, day, hour, minute)})
    data = sorted(data, key=lambda n: n['date'])
    return pd.DataFrame(data)


def visualize_data(df):
    """
    Plot data binned by hour.

    x-axis is the hour, y-axis is the number of datapoints.

    Parameters
    ----------
    df : Pandas.Dataframe object
    """
    df.groupby(df["date"].dt.hour).count().plot(kind="bar")
    plt.show()


df = create_data(2000, 2017)
visualize_data(df)
As you can see, the 7, 9 and 10 are missing.
Upvotes: 1
Views: 4862
Reputation: 6034
Try this function:
def visualize_data(df):
    """
    Plot data binned by hour.

    x-axis is the hour, y-axis is the number of datapoints.

    Parameters
    ----------
    df : Pandas.Dataframe object
    """
    y = df.groupby(df["date"].dt.hour).count()
    for i in range(24):
        y.loc[i] = 0 if i not in y.index else y.loc[i]  # Add missing locations.
    y.sort_index(inplace=True)  # Sort the locations.
    y.plot(kind="bar")
    plt.show()
Upvotes: 1
Reputation:
Reindex the resulting DataFrame over all 24 hours, filling the missing hours with 0, and then call the plot method:
res = df.groupby(df["date"].dt.hour).count().reindex(np.arange(24), fill_value=0)
res.plot(kind="bar")
plt.show()
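The effect of the reindex step is easy to check on a few hand-made timestamps. The following is only a minimal sketch: the helper name `hourly_counts` and the sample values are illustrative assumptions, and it uses `.size()` instead of `.count()` so that the result is a plain Series of counts.

```python
import pandas as pd


def hourly_counts(dates):
    """Return the number of events per hour of day, for all 24 hours.

    `hourly_counts` is a hypothetical helper, not part of the original code.
    """
    s = pd.Series(pd.to_datetime(dates))
    # After the groupby, only hours that actually occur are present.
    counts = s.groupby(s.dt.hour).size()
    # reindex adds the missing hours and sets their count to 0.
    return counts.reindex(range(24), fill_value=0)


# Made-up sample data: two events in hour 8, one in hour 23.
sample = ["2017-01-01 08:15", "2017-01-01 08:45", "2017-03-02 23:05"]
counts = hourly_counts(sample)
print(counts.loc[[7, 8, 23]])
```

Calling `counts.plot(kind="bar")` on the result then draws one bar position per hour, including the empty ones.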
Upvotes: 4
Reputation: 764
To make the plot look nicer, you can switch to a different style:
matplotlib.style.use('ggplot')
See https://pandas.pydata.org/pandas-docs/stable/visualization.html
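As a rough sketch of how the style setting might be combined with a 24-bin bar chart: the counts below are made-up values, and the axis labels and the output filename `events_per_hour.png` are assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Apply the ggplot style sheet before creating any figure.
plt.style.use("ggplot")

# Hypothetical counts: all 24 hourly bins exist, most of them empty.
counts = pd.Series(0, index=range(24))
counts.loc[[8, 9, 21]] = [5, 2, 7]

ax = counts.plot(kind="bar", width=0.9)
ax.set_xlabel("Hour of day")
ax.set_ylabel("Number of events")
ax.figure.tight_layout()

# Assumed output file; use plt.show() instead for an interactive window.
ax.figure.savefig("events_per_hour.png")
```

Setting the style once at the top affects every figure created afterwards, so it does not need to be repeated inside a plotting function.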
Regarding "As you can see, the 7, 9 and 10 are missing": were there 0 events in those hours?
Upvotes: -1