Reputation: 149
I'm planning to use the Python library Plotly to build a Gantt chart, specifically this example: https://plotly.com/python/gantt/#group-tasks-together.
However, each "Job" can have multiple tasks, and these tasks can run in parallel. From what I have observed, Plotly doesn't stack parallel tasks vertically; it draws them on top of each other, which makes the chart very hard to read. Here is an example where "Job A" has two tasks running in parallel but only one is visible:
import plotly.figure_factory as ff
data = [dict(Task="Job A", Start='2009-01-01', Finish='2009-02-28'),
        dict(Task="Job A", Start='2009-01-01', Finish='2009-02-28'),
        dict(Task="Job B", Start='2009-03-05', Finish='2009-04-15'),
        dict(Task="Job C", Start='2009-02-20', Finish='2009-05-30')]
# Without group_tasks=True, there would be two separate "Job A" labels
fig = ff.create_gantt(data, group_tasks=True)
fig.show()
What I want is for both "Job A" tasks to be visible and stacked vertically, with the "Job A" label centered in the vertical space taken up by its tasks. Something like this, but without two "Job A" labels:
If anyone has library recommendations I should consider for my Gantt chart project, please feel free to share! Thank you!
Upvotes: 1
Views: 1637
Reputation: 21
import plotly.express as px
import pandas as pd
data = [dict(Task="Job A", Start='2009-01-01', Finish='2009-02-28'),
        dict(Task="Job A", Start='2009-01-01', Finish='2009-02-28'),
        dict(Task="Job B", Start='2009-03-05', Finish='2009-04-15'),
        dict(Task="Job C", Start='2009-02-20', Finish='2009-05-30')]
df = pd.DataFrame(data)
# Number the tasks within each job: 1 for the first row of a job, 2 for the next, and so on.
# This assumes that rows belonging to the same job are adjacent.
df['JobNum'] = ""
df.loc[0,'JobNum'] = 1
for idx in range(1,df.shape[0]):
    if df.loc[idx-1,'Task'] == df.loc[idx,'Task']:
        df.loc[idx,'JobNum'] = df.loc[idx-1,'JobNum'] + 1
    else:
        df.loc[idx,'JobNum'] = 1
# Label of the form "Job A|2", used later to recover the task number from the hover text
df['hoverName'] = df.apply(lambda x: x['Task'] + "|" + str(x['JobNum']), axis=1)
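The numbering loop above only counts repeats that sit on adjacent rows. As a minimal sketch of the same idea that also handles a job whose rows are scattered through the data, the helper below (lane_numbers is a hypothetical name, not part of Plotly or pandas) counts occurrences per label in plain Python:

```python
from collections import defaultdict

def lane_numbers(tasks):
    """Return a 1-based counter per task label: the first occurrence
    of a label gets 1, the second gets 2, and so on."""
    seen = defaultdict(int)
    lanes = []
    for task in tasks:
        seen[task] += 1
        lanes.append(seen[task])
    return lanes

tasks = ["Job A", "Job A", "Job B", "Job C"]
print(lane_numbers(tasks))  # [1, 2, 1, 1]
```

With a DataFrame, df.groupby('Task').cumcount() + 1 should give the same numbering, although as an integer column rather than the mixed column built above.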
Approach 1: Using Facet row
fig = px.timeline(df,
                  x_start="Start",
                  x_end="Finish",
                  y="Task",
                  hover_name="Task",
                  color_discrete_sequence=px.colors.qualitative.Prism,
                  opacity=.7,
                  template='plotly_white',
                  color='Task',
                  facet_row='JobNum',
                  hover_data=['Start','Finish'],
                  )
fig.show()
Approach 2: Adjusting width and offset. This needs to be generalized when there are more than two parallel tasks.
fig = px.timeline(df,
                  x_start="Start",
                  x_end="Finish",
                  y="Task",
                  hover_name="hoverName",
                  color_discrete_sequence=px.colors.qualitative.Prism,
                  opacity=.7,
                  template='plotly_white',
                  color='JobNum',
                  hover_data=['Start','Finish'],
                  )
# Each trace holds the bars for one JobNum; narrow the bars and shift them apart
for obj in fig.data:
    Task, JobNum = obj.hovertext[0].split("|")
    if (int(JobNum) == 1):
        obj.width = 0.1
        obj.offset = 0.05
    elif (int(JobNum) == 2):
        obj.width = 0.1
        obj.offset = -0.05
fig.show()
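One possible way to generalize the hard-coded width and offset values to any number of parallel tasks is to split a fixed band around each category evenly. This is only a sketch: lane_geometry and its total_width parameter are hypothetical names, and it assumes that offset marks the starting edge of the bar relative to the category position, which is how the two values above behave.

```python
def lane_geometry(lane, n_lanes, total_width=0.8):
    """Return (width, offset) for a 1-based lane so that n_lanes bars
    tile a band of total_width centered on the category position."""
    if not 1 <= lane <= n_lanes:
        raise ValueError("lane must be between 1 and n_lanes")
    width = total_width / n_lanes
    # The first lane starts at the edge of the band; each later lane starts one width further on
    offset = -total_width / 2 + (lane - 1) * width
    return width, offset

for lane in (1, 2, 3):
    width, offset = lane_geometry(lane, n_lanes=3)
    print(lane, round(width, 3), round(offset, 3))
```

Inside the loop over fig.data, the if/elif branches could then become obj.width, obj.offset = lane_geometry(int(JobNum), n_lanes), with n_lanes set to the largest JobNum in the data. Jobs with fewer parallel tasks would leave some of their band empty.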
Upvotes: 2
Reputation: 19610
A starting point would be to use fig.add_shape to add an identical Task as a rectangle below the original Task.
To do this, we need the y-coordinates of each rectangle, but conveniently, the first bar will be at y=0, the second bar at y=1, and so on. Therefore, the index of each task in the ordered list of unique tasks is also its y-coordinate (the unique tasks are [Job A, Job B, Job C], so the Job C bar will be centered at y=2). The default width of each bar is 0.8, so a bar centered at y0 extends 0.4 to either side, and a rectangle drawn from y0 to y1 = y0-0.4 covers one half of the bar.
Note that, as currently written, the added shapes have no hover template, and every added rectangle gets the same color.
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
## added additional duplicate Task to demonstrate generalizability
data = [dict(Task="Job A", Start='2009-01-01', Finish='2009-02-28'),
        dict(Task="Job A", Start='2009-01-01', Finish='2009-02-28'),
        dict(Task="Job B", Start='2009-03-05', Finish='2009-04-15'),
        dict(Task="Job C", Start='2009-02-20', Finish='2009-05-30'),
        dict(Task="Job C", Start='2009-02-20', Finish='2009-05-30')]
df = pd.DataFrame(data)
# Without group_tasks=True, there would be two separate "Job A" labels
# fig = ff.create_gantt(data, group_tasks=True)
## plot the non-duplicate rows
fig = px.timeline(df.loc[~df['Task'].duplicated()], x_start="Start", x_end="Finish", y="Task")
## plot the duplicate rows using rectangular shapes
for row in df.loc[df['Task'].duplicated()].itertuples():
    # row[0] is the index, row[1] the Task, row[2] the Start, row[3] the Finish
    y_val = np.where(df.Task.unique()==row[1])[0][0]
    # print(f"found {row[1]} at index {y_val}")
    fig.add_shape(type="rect",
                  xref="x", yref="y",
                  x0=row[2], x1=row[3],
                  y0=y_val, y1=y_val-0.4,
                  line_width=0,
                  fillcolor="salmon",
                  )
fig.show()
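The index lookup used above can be sketched without Plotly or NumPy, which makes the y-coordinates easy to check by hand. The function name duplicate_rectangles and the half_width parameter are hypothetical; the sketch assumes, as in the answer, that categories are placed at 0, 1, 2, ... in order of first appearance.

```python
def duplicate_rectangles(tasks, half_width=0.4):
    """Map each task label to its y position (order of first appearance)
    and return (row_index, y0, y1) for every repeated label."""
    y_index = {}
    rects = []
    for i, task in enumerate(tasks):
        if task in y_index:
            # Repeated label: draw from the bar's center down by half a bar width
            y0 = y_index[task]
            rects.append((i, y0, y0 - half_width))
        else:
            y_index[task] = len(y_index)
    return y_index, rects

tasks = ["Job A", "Job A", "Job B", "Job C", "Job C"]
y_index, rects = duplicate_rectangles(tasks)
print(y_index)  # {'Job A': 0, 'Job B': 1, 'Job C': 2}
for row, y0, y1 in rects:
    print(row, y0, round(y1, 2))
```

Because every repeat of a label gets the same y0 and y1, a third task with the same label would be drawn on top of the second, so the bounds would need to vary per repeat to support more than two parallel tasks.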
Upvotes: 2