Reputation: 123
Here is my Python code which basically plots a Gantt chart:
import pandas as pd
import random
from datetime import datetime
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
%matplotlib inline
import math
plt.style.use('ggplot')
df = pd.read_csv('zpp00141_new.csv')
def timestr_to_num(timestr):
    return mdates.date2num(datetime.strptime('0' + timestr if timestr[1] == ':' else timestr, '%I:%M:%S %p'))
df.rename(columns={"Earl. start / time": "start", "Latest finish / time": "finish"}, inplace = True)
df['Operation/Activity'] = df['Operation/Activity'].astype(str)
fig, ax = plt.subplots(figsize=(10, 5))
operations = pd.unique(df['Operation/Activity'])
#df.assign(start=df['Earl. start / time'])
colors = plt.cm.tab10.colors # get a list of 10 colors
colors *= math.ceil(len(operations) / (len(colors))) # repeat the list as many times as needed
for operation, color in zip(operations, colors):
    for row in df[df['Operation/Activity'] == operation].itertuples():
        left = timestr_to_num(row.start)
        right = timestr_to_num(row.finish)
        ax.barh(operation, left=left, width=right - left, height=3, color=color)
ax.set_xlim(timestr_to_num('07:00:00 AM'), timestr_to_num('4:30:00 PM'))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M')) # display ticks as hours and minutes
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1)) # set a tick every hour
ax.set_xlabel('Time')
ax.set_ylabel('Operation')
plt.tight_layout()
plt.show()
You can see the output in the attached picture:
I would like to plot a vertical straight line that corresponds to the current time on the x-axis. I tried adding the following to my code, but I can't figure out how to make it work. I assume there might be an issue with my time formatting or something like that:
plt.axvline(pd.Timestamp.now(),color='r')
I would really appreciate any assistance in this matter.
Here is a picture of the desired output; I want my plot to look like this:
Also, I would like to add another category, "Operation short text", to my y-axis alongside "Operation/Activity", so that it shows not only the operation number but also the description of the operation next to it. To give an idea of what my data looks like, see below (the first row is a header):
Operation short text,Operation/Activity,Earl. start / time,Latest finish / time
Mount right racks,0250,7:00:00 AM,9:22:00 AM
Mount right side motion unit carriage,0251,9:22:00 AM,10:30:00 AM
Mount left side motion unit carriage,0252,10:30:00 AM,11:17:00 AM
Install motion unit complete,0253,11:17:00 AM,1:01:00 PM
Move machine to next step + EPA,0254,1:01:00 PM,3:30:00 PM
Mount Left Racks,0200,7:00:00 AM,9:12:00 AM
Mount cable motor & Lubricate guide carr,0201,9:12:00 AM,9:44:00 AM
Mount suction components,0202,9:44:00 AM,11:04:00 AM
Mount extraction,0203,11:04:00 AM,12:34:00 PM
Mount temporary diamond plates,0204,12:34:00 PM,1:04:00 PM
Mount piping inside,0205,1:04:00 PM,1:44:00 PM
Move Machine to next step + EPA,0206,1:44:00 PM,3:30:00 PM
Upvotes: 2
Views: 4368
Reputation: 80329
The easiest approach seems to be to sort the dataframe by operation and then plot the horizontal bars using the index of the dataframe as the y-coordinate. Reversing the limits of the y-axis (setting them from high to low) then puts the lowest-numbered operation on top. (The code now assumes that each bar is on its own line, whereas the old code assumed there could be several bars for one operation.)
As the operations now seem to belong together, a colormap with sequential colors is chosen, and the colors start again each time an operation starts no later than the previous one. Feel free to use any scheme that suits your goals.
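The color-restart rule can be looked at in isolation. The following is a minimal sketch, not part of the plotting code itself: `color_indices` is a hypothetical helper name, and the start times are written as minutes since midnight purely for illustration.

```python
import math


def color_indices(starts, n_colors=10):
    """Return one palette index per bar.

    The index goes back to 0 whenever a bar starts no later than
    the previous one, i.e. when a new chain of operations begins.
    """
    indices = []
    previous_start = math.inf  # forces a restart on the very first bar
    color_start = 0
    for i, start in enumerate(starts):
        if start <= previous_start:
            color_start = i
        indices.append((i - color_start) % n_colors)
        previous_start = start
    return indices


# Start times (minutes since midnight) of the sample rows,
# sorted by operation number: 0200..0206 followed by 0250..0254.
starts = [420, 552, 584, 664, 754, 784, 824, 420, 562, 630, 677, 781]
print(color_indices(starts))  # [0, 1, 2, 3, 4, 5, 6, 0, 1, 2, 3, 4]
```

The second chain (operations 0250 to 0254) starts again at 7:00, so its first bar gets palette index 0 again.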
As datetime.strptime only looks at the time, it gets a default date (January 1st, 1900). So the approach of using the same conversion for the 'now' time is very adequate, as it places the line on that same default date.
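A small standard-library sketch of that default date, and of one alternative way to put the current time on it. The names `TIME_FORMAT` and `now_on_default_date` are only illustrative; the answer's code goes through `strftime` and `timestr_to_num` instead.

```python
from datetime import datetime

TIME_FORMAT = '%I:%M:%S %p'

# A time-only format leaves the date at strptime's default.
parsed = datetime.strptime('07:00:00 AM', TIME_FORMAT)
print(parsed)  # 1900-01-01 07:00:00

# Combine that default date with the current wall-clock time, so that
# "now" lands on the same day as the parsed start and finish times.
now_on_default_date = datetime.combine(parsed.date(), datetime.now().time())
print(now_on_default_date.date())  # 1900-01-01
```

A timestamp carrying today's date, such as the one from `pd.Timestamp.now()`, lies more than a century to the right of the bars, which would explain why the line does not show up inside the axis limits.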
Note that pd.read_csv's type sniffer gives a numeric format to the operations column. You could prevent this by giving it explicit conversion information, e.g. pd.read_csv(..., converters={1: str}) to read the second column as strings.
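To illustrate the difference, here is a short sketch that reads two of the sample rows from an in-memory string instead of the real file (`csv_text`, `sniffed` and `as_text` are names made up for this example):

```python
import io

import pandas as pd

csv_text = """Operation short text,Operation/Activity,Earl. start / time,Latest finish / time
Mount right racks,0250,7:00:00 AM,9:22:00 AM
Mount Left Racks,0200,7:00:00 AM,9:12:00 AM
"""

# Default type inference: the operation numbers become numeric
# and lose their leading zero.
sniffed = pd.read_csv(io.StringIO(csv_text))
print(sniffed['Operation/Activity'].tolist())  # [250, 200]

# Explicit converter for the second column (index 1): values stay strings.
as_text = pd.read_csv(io.StringIO(csv_text), converters={1: str})
print(as_text['Operation/Activity'].tolist())  # ['0250', '0200']
```

Passing `dtype={'Operation/Activity': str}` should have the same effect if you prefer to refer to the column by name.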
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import datetime
import pandas as pd
import math
# % matplotlib inline
def timestr_to_num(timestr):
    return mdates.date2num(datetime.strptime('0' + timestr if timestr[1] == ':' else timestr, '%I:%M:%S %p'))
plt.style.use('ggplot')
# df = pd.read_csv('zpp00141_new.csv')
columns = ['Operation short text', 'Operation/Activity', 'Earl. start / time', 'Latest finish / time']
rows = [['Mount right racks', '0250', '7:00:00 AM', '9:22:00 AM'],
['Mount right side motion unit carriage', '0251', '9:22:00 AM', '10:30:00 AM'],
['Mount left side motion unit carriage', '0252', '10:30:00 AM', '11:17:00 AM'],
['Install motion unit complete', '0253', '11:17:00 AM', '1:01:00 PM'],
['Move machine to next step + EPA', '0254', '1:01:00 PM', '3:30:00 PM'],
['Mount Left Racks', '0200', '7:00:00 AM', '9:12:00 AM'],
['Mount cable motor & Lubricate guide carr', '0201', '9:12:00 AM', '9:44:00 AM'],
['Mount suction components', '0202', '9:44:00 AM', '11:04:00 AM'],
['Mount extraction', '0203', '11:04:00 AM', '12:34:00 PM'],
['Mount temporary diamond plates', '0204', '12:34:00 PM', '1:04:00 PM'],
['Mount piping inside', '0205', '1:04:00 PM', '1:44:00 PM'],
['Move Machine to next step + EPA', '0206', '1:44:00 PM', '3:30:00 PM']]
df = pd.DataFrame(data=rows, columns=columns)
df.rename(columns={"Earl. start / time": "start", "Latest finish / time": "finish"}, inplace=True)
df['Operation/Activity'] = df['Operation/Activity'].astype(int)
df.sort_values('Operation/Activity', ascending=True, inplace=True, ignore_index=True)
fig, ax = plt.subplots(figsize=(10, 5))
#colors = plt.cm.tab10.colors # get a list of 10 colors
cmap = plt.get_cmap('plasma_r') # plt.cm.get_cmap() was removed in Matplotlib 3.9
colors = [cmap(i/9) for i in range(10)] # get a list of 10 colors
previous_start = math.inf # 'previous_start' helps to indicate we're starting again from the left
color_start = 0
for row in df.itertuples():
    left = timestr_to_num(row.start)
    right = timestr_to_num(row.finish)
    if left <= previous_start:
        color_start = row.Index
    ax.barh(row.Index, left=left, width=right - left, height=1, color=colors[(row.Index - color_start) % len(colors)])
    previous_start = left
ax.set_xlim(timestr_to_num('7:00:00 AM'), timestr_to_num('4:30:00 PM'))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M')) # display ticks as hours and minutes
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1)) # set a tick every hour
ax.set_xlabel('Time')
ax.set_ylabel('Operation')
ax.set_ylim(len(df), -1) # set the limits and reverse the order
ax.set_yticks(range(len(df)))
# ax.set_yticklabels(list(df['Operation/Activity']))
ax.set_yticklabels(list(df['Operation short text']))
now = datetime.now().strftime('%I:%M:%S %p')
ax.axvline(x=timestr_to_num(now),color='r')
plt.tight_layout()
plt.show()
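If you want both the operation number and the description on the y-axis, the tick labels can be built by combining the two columns. This is a rough sketch on a two-row stand-in dataframe; the four-digit zero padding is an assumption based on the sample data, where the operation numbers were converted to integers with astype(int).

```python
import pandas as pd

# Stand-in for the sorted dataframe from the code above (two rows only).
df = pd.DataFrame({
    'Operation short text': ['Mount Left Racks', 'Mount right racks'],
    'Operation/Activity': [200, 250],  # integers, as after astype(int)
})

# Pad the number back to four digits (assumed width) and append the text.
labels = [f'{op:04d}  {text}'
          for op, text in zip(df['Operation/Activity'], df['Operation short text'])]
print(labels)  # ['0200  Mount Left Racks', '0250  Mount right racks']
```

In the plotting code, the list would then be passed as ax.set_yticklabels(labels) in place of the 'Operation short text' column.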
Upvotes: 2