brebs

Reputation: 234

Plot multiple columns using pandas and plotly timeline

There are multiple questions on plotting multiple graphs but not specifically for pandas and timelines.

I have a dataframe like the below:

Name day_1_start day_1_end day_2_start day_2_end
A 1:00pm 3:00pm 3:30pm 5:30pm
B 11:00am 1:00pm 3:45pm 4:30pm
C 10:00am 11:00am 11:30am 4:30pm

I am trying to plot this as a Gantt chart/timeline. Using plotly.express.timeline, is it possible to have multiple x_start and x_end columns? Namely, can day_1_start and day_2_start both be used as x_start, and day_1_end and day_2_end both be used as x_end?

I believe I can also solve this by creating a new table (like the example below), but I'm wondering whether it's possible without this transformation, as it's expensive.

Name start end day
A 1:00pm 3:00pm 1
A 3:30pm 5:30pm 2
B 11:00am 1:00pm 1
B 3:45pm 4:30pm 2

This is roughly what I'd like to end up with, with day 1 and day 2 shown as different colors (from https://plotly.com/python/gantt/).

Upvotes: 1

Views: 1755

Answers (1)

r-beginners

Reputation: 35230

Since the two start/end column pairs cannot be treated as a single time series, the wide format needs to be converted to long format. I could not get wide_to_long() to work while keeping multiple columns, so I created a data frame for each day, somewhat clumsily, and concatenated them vertically. There may be other, more elegant ways to do this. Once the data is in long format, you can draw a timeline with px.timeline() or ff.create_gantt(). ff.create_gantt() expects specific column names, so I renamed the columns for it.

import pandas as pd
import io

data = '''
Name day_1_start day_1_end day_2_start day_2_end
A 1:00pm 3:00pm 3:30pm 5:30pm
B 11:00am 1:00pm 3:45pm 4:30pm
C 10:00am 11:00am 11:30am 4:30pm
'''

# sep=r'\s+' splits on whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(io.StringIO(data), sep=r'\s+')

# One frame per day, with the day-specific columns renamed to common names
df_d1 = df[['Name','day_1_start','day_1_end']]
df_d1 = df_d1.assign(day='day_1').rename(columns={'day_1_start':'start','day_1_end':'end'})

df_d2 = df[['Name','day_2_start','day_2_end']]
df_d2 = df_d2.assign(day='day_2').rename(columns={'day_2_start':'start','day_2_end':'end'})

# Stack the per-day frames vertically to get the long format
df_long = pd.concat([df_d1,df_d2],axis=0, ignore_index=True)

# Parse the time strings; with no date given, pandas fills in a default date
df_long['start'] = pd.to_datetime(df_long['start'])
df_long['end'] = pd.to_datetime(df_long['end'])
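
As a possible alternative to building one frame per day, the reshape might also be done with pd.wide_to_long() if the columns are first renamed so that the day number becomes a suffix (start_1, end_1, ...). This is only a sketch under that assumption: the renaming step and the name df_long_alt are my own additions for illustration, and I have only tried it against the sample data above.

```python
import io
import re
import pandas as pd

data = '''
Name day_1_start day_1_end day_2_start day_2_end
A 1:00pm 3:00pm 3:30pm 5:30pm
B 11:00am 1:00pm 3:45pm 4:30pm
C 10:00am 11:00am 11:30am 4:30pm
'''
df = pd.read_csv(io.StringIO(data), sep=r'\s+')

# Assumption: columns follow the pattern day_<n>_<start|end>.
# Rename them to <start|end>_<n> so wide_to_long can match stub + suffix.
renamed = df.rename(
    columns=lambda c: re.sub(r'^day_(\d+)_(start|end)$', r'\2_\1', c)
)

# Reshape: one row per (Name, day), with 'start' and 'end' columns
df_long_alt = (
    pd.wide_to_long(renamed, stubnames=['start', 'end'], i='Name', j='day', sep='_')
    .reset_index()
    .sort_values(['day', 'Name'], ignore_index=True)
)

# Turn the numeric day suffix back into a label usable for color=
df_long_alt['day'] = 'day_' + df_long_alt['day'].astype(str)

print(df_long_alt)
```

The resulting frame has the same Name/start/end/day columns as df_long (possibly in a different column order), so the datetime conversion and plotting steps should apply to it unchanged. It also extends to more than two days without adding a block per day.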

px.timeline

import plotly.express as px
import pandas as pd

fig = px.timeline(df_long, x_start="start", x_end="end", y="Name", color="day")
fig.update_yaxes(autorange="reversed")
fig.show()


ff.create_gantt

import plotly.figure_factory as ff

colors = {'day_1': 'rgb(220, 0, 0)', 'day_2': 'rgb(0, 255, 100)'}
df_long_ff = df_long.copy()
df_long_ff.columns = ['Task', 'Start', 'Finish', 'Resource']
fig = ff.create_gantt(df_long_ff, colors=colors, index_col='Resource', show_colorbar=True, group_tasks=True)
fig.show()


Upvotes: 0
