Reputation: 136187
I have a CSV, generated by a script, which looks like this:
date,last_activity
2021-11-03 07:39:14,160
2021-11-03 07:39:44,1594
2021-11-03 07:57:15,4270
2021-11-03 07:57:45,23201
2021-11-03 07:58:15,7
2021-11-03 07:58:45,1015
2021-11-03 07:59:15,2
2021-11-03 07:59:45,3496
2021-11-03 08:28:16,6093
2021-11-03 08:28:46,5513
2021-11-03 08:31:46,16639
I would like to visualize those timestamps as an "activity bar", e.g. like this:
Hence, there should be one mark for every date timestamp. To make it simpler, the last_activity column could be ignored.
The simplest solution I can imagine would be to use one pixel per minute of the day. I can round 2021-11-03 07:39:14 to 2021-11-03 07:39 and just say "I've seen a timestamp for 7:39 -> color that pixel". However, I would only know how to do this directly with matplotlib (pixel-by-pixel). Is there a simpler way with Pandas?
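A minimal sketch of the per-minute idea described above, using only pandas. The inline CSV_TEXT sample and the helper name minutes_seen are assumptions made for illustration; a real run would read the CSV file instead. It marks each of the 1440 minutes of the day as seen or not seen:

```python
import io
import pandas as pd

# Hypothetical inline sample; in practice this would be the CSV file itself.
CSV_TEXT = """date,last_activity
2021-11-03 07:39:14,160
2021-11-03 07:39:44,1594
2021-11-03 07:57:15,4270
2021-11-03 08:31:46,16639
"""

def minutes_seen(csv_source):
    """Return a boolean Series with one entry per minute of the day (1440 entries)."""
    df = pd.read_csv(csv_source, parse_dates=["date"])
    # Dropping the seconds is the "rounding" step: 07:39:14 -> minute 7*60 + 39.
    minute_of_day = df["date"].dt.hour * 60 + df["date"].dt.minute
    seen = pd.Series(False, index=range(24 * 60))
    seen.loc[minute_of_day.unique()] = True
    return seen

seen = minutes_seen(io.StringIO(CSV_TEXT))
```

One possible way to draw the result would be to pass seen.to_numpy().astype(int)[None, :] to matplotlib's imshow with aspect="auto", which paints one column per minute. Note that this sketch ignores the calendar date, so it assumes all rows belong to the same day.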
Upvotes: 1
Views: 937
Reputation: 2860
EDIT: I just checked out plotly and it seems to be built well enough to handle your exact problem. In my opinion, the solution using this library completely knocks out my previous attempt with matplotlib.
plotly supports dates on the axis directly (so no workaround using timestamps is required), and has hover annotations as well. Here is the code:
import pandas
import dateutil.parser  # Import the submodule explicitly; "import dateutil" alone does not guarantee it is loaded
import plotly.graph_objects as go

# Load and transform the data
filedata = pandas.read_csv("test.csv")
datelist = filedata["date"].to_list()
timestamplist = [dateutil.parser.parse(x) for x in datelist]
length = len(datelist)

# Create the figure: one marker per timestamp, all on the line y = 0
fig = go.Figure()
fig.add_trace(
    go.Scatter(x=timestamplist, y=[0] * length, mode="markers", marker_size=20)
)
fig.update_xaxes(showgrid=False)
# Hide the y-axis labels and draw a single black baseline
fig.update_yaxes(
    showgrid=False,
    zeroline=True,
    zerolinecolor="black",
    zerolinewidth=3,
    showticklabels=False,
)
fig.update_layout(height=200, plot_bgcolor="white", title="My Timeline Title")
fig.show()
And here is the result. Note that the x-axis has date markers as you wanted, and an annotation appears when you hover the mouse pointer over a data point.
Previous/Old answer using matplotlib:
You can use matplotlib to plot the timeline. In order to place the marks correctly, we need to convert the dates to timestamps.
However, you also want to view the date associated with each mark. To get around that, I suggest using the mplcursors library, which shows an annotation when you click on a data point.
You can probably use eventplot in matplotlib to plot this. Here's my rookie attempt:
import pandas
import dateutil.parser # For parsing date
import matplotlib.pyplot as plt # for plotting
import mplcursors # For clickable annotation
filedata = pandas.read_csv('test.csv') # Read database
datelist = filedata['date'].to_list()
start = dateutil.parser.parse(datelist[0]).timestamp() # Timestamp of the first date. PS: I am assuming that the data is sorted, and I'm taking the first element only.
# Normalize all timestamps by subtracting the first.
timestamplist = [round(dateutil.parser.parse(x).timestamp() - start) for x in datelist]
plt.title('My timeline plot') # Title of the plot
linept = plt.eventplot(timestamplist, orientation='horizontal') # Insert a line for every timestamp
x = mplcursors.cursor(linept) # Adds an annotation to every point
x.connect("add", lambda sel: sel.annotation.set_text(f'{datelist[timestamplist.index(round(sel.target[0]))]}')) # Display corresponding date
#plt.xticks(date(datelist))
plt.tick_params(labelleft=False, left=False) # Remove the Y axis on the left
plt.show() # Display plot
This gives a plot like this:
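As an aside on the normalization step in the code above, the same "seconds since the first timestamp" offsets can probably be computed with pandas alone. The sketch below is an alternative, not the code used for the plot; the inline CSV_TEXT sample and the helper name seconds_since_first are assumptions for illustration. It sorts the dates first, so it does not rely on the file being sorted:

```python
import io
import pandas as pd

# Hypothetical inline sample; in practice this would be the CSV file itself.
CSV_TEXT = """date,last_activity
2021-11-03 07:39:14,160
2021-11-03 07:39:44,1594
2021-11-03 07:57:15,4270
"""

def seconds_since_first(csv_source):
    """Return each timestamp as whole seconds elapsed since the earliest one."""
    df = pd.read_csv(csv_source, parse_dates=["date"])
    dates = df["date"].sort_values()
    # Subtracting datetimes gives timedeltas; convert those to seconds.
    offsets = (dates - dates.iloc[0]).dt.total_seconds()
    return offsets.round().astype(int).tolist()

offsets = seconds_since_first(io.StringIO(CSV_TEXT))
print(offsets)  # [0, 30, 1081]
```

The resulting list could be passed to plt.eventplot in place of timestamplist. Because it subtracts the parsed datetimes directly, it also avoids calling .timestamp() on naive datetimes, which interprets them in the local time zone.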
Upvotes: 1