user5813583

Reputation: 133

r - plotting gantt chart where multiple periods exist within one category

I borrowed example data from another post (Add shading to a gantt chart to delineate weekends) and modified it to suit my situation.

My issue with plotting a Gantt chart is that my data includes multiple periods within one category. In the sample below, I added two more periods for "Write introduction" and one more for "Write results".

I also want to colour specific periods that meet certain criteria. Here, I flagged a period if any portion of it falls in May or August.

Having up to two periods in one category seems to work fine, but when there are three, they are merged into a single period.

My real dataset is much more complicated: one category sometimes has more than 10 periods, and the criteria for flagging are very specific.

I'm not sure how I could address this issue.


library(reshape2)
library(ggplot2)


# Create a list of tasks name strings.
tasks <- c("Write introduction", "Parse citation data",
           "Construct data timeline",
           "Write methods", "Model formulation", 
           "Model selection", "Write results", "Write discussion",
           "Write abstract and editing",
           
           "Write introduction", "Write introduction", "Write results")

# Compile dataframe of task names, and respective start and end dates.
dfr <- data.frame(
  name = factor(tasks, levels = tasks[1:9]),
  start.date = as.Date(c("2018-04-09", "2018-04-09", "2018-04-16",
                         "2018-04-30", "2018-04-16", "2018-05-21",
                         "2018-06-04", "2018-07-02", "2018-07-30",
                         
                         "2018-05-15", "2018-06-03", "2018-07-25"
                         )),
  end.date = as.Date(c("2018-04-30", "2018-04-20", "2018-05-18",
                       "2018-06-01", "2018-05-18", "2018-06-01",
                       "2018-06-29", "2018-07-27", "2018-08-31",
                       
                       "2018-05-29", "2018-06-20", "2018-08-15")),
  flag = c(0, 0, 1, 
           1, 1, 1,
           0, 0, 1, 
           1, 0, 1)
)

# Merge start and end dates into durations.
mdfr <- melt(dfr, measure.vars = c("start.date", "end.date"))



# Gantt chart

ggplot(mdfr) +
  
  geom_line(aes(value, name, colour = as.factor(flag)), size = 4) +
  labs(title = "Project gantt chart",
       x = NULL,
       y = NULL) +
  theme_minimal() 
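
The flagging rule described above (a period is flagged if any part of it falls in May or August) does not have to be typed by hand. The following is a minimal base R sketch, assuming the windows are May 2018 and August 2018; the helper `overlaps()` and the `windows` data frame are hypothetical names introduced only for illustration. It uses the standard interval overlap test: two closed intervals overlap when each one starts no later than the other ends.

```r
# Start and end dates copied from the sample data in the question.
start.date <- as.Date(c("2018-04-09", "2018-04-09", "2018-04-16",
                        "2018-04-30", "2018-04-16", "2018-05-21",
                        "2018-06-04", "2018-07-02", "2018-07-30",
                        "2018-05-15", "2018-06-03", "2018-07-25"))
end.date <- as.Date(c("2018-04-30", "2018-04-20", "2018-05-18",
                      "2018-06-01", "2018-05-18", "2018-06-01",
                      "2018-06-29", "2018-07-27", "2018-08-31",
                      "2018-05-29", "2018-06-20", "2018-08-15"))

# Hypothetical helper: TRUE where [start, end] overlaps [win.start, win.end].
overlaps <- function(start, end, win.start, win.end) {
  start <= win.end & end >= win.start
}

# Assumed flagging windows: May 2018 and August 2018.
windows <- data.frame(
  win.start = as.Date(c("2018-05-01", "2018-08-01")),
  win.end   = as.Date(c("2018-05-31", "2018-08-31"))
)

# A period is flagged if it overlaps at least one window.
flagged <- rep(FALSE, length(start.date))
for (i in seq_len(nrow(windows))) {
  flagged <- flagged | overlaps(start.date, end.date,
                                windows$win.start[i], windows$win.end[i])
}
flag <- as.numeric(flagged)
print(flag)
```

For a real dataset with more elaborate criteria, adding rows to `windows` (or swapping in a different test function) keeps the rule in one place.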

Upvotes: 1

Views: 3429

Answers (2)

Z.Lin

Reputation: 29095

The geom_line approach should work if you retain an identifier for each pair of start / end dates, and specify that as the grouping variable so that ggplot knows which coordinates belong to the same line:

dfr$row.id <- seq_len(nrow(dfr)) # use row id to identify each original row uniquely
mdfr <- melt(dfr, measure.vars = c("start.date", "end.date"))

ggplot(mdfr,
       aes(x = value, y = name, colour = factor(flag), group = row.id)) + # add group aesthetic
  geom_line(size = 4) +
  labs(title = "Project gantt chart",
       x = NULL,
       y = NULL) +
  theme_minimal() 
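
As for why the original `geom_line` call merges periods: when no `group` is given, ggplot2 groups by the interaction of all discrete variables in the plot, here the task name and `factor(flag)`. Two periods of the same task with the same flag therefore land in one group, and `geom_line` joins all of that group's points into a single line. The base R sketch below mimics that grouping on a reduced copy of the question's data (only the two tasks that have several periods). The `interaction()` call is an approximation of ggplot2's default for illustration, not its internal code.

```r
# Reduced copy of the question's data: three "Write introduction" periods
# (flags 0, 1, 0) and two "Write results" periods (flags 0, 1).
dfr <- data.frame(
  name = c("Write introduction", "Write results",
           "Write introduction", "Write introduction", "Write results"),
  start.date = as.Date(c("2018-04-09", "2018-06-04",
                         "2018-05-15", "2018-06-03", "2018-07-25")),
  end.date = as.Date(c("2018-04-30", "2018-06-29",
                       "2018-05-29", "2018-06-20", "2018-08-15")),
  flag = c(0, 0, 1, 0, 1)
)
dfr$row.id <- seq_len(nrow(dfr))

# Long format without reshape2: one row per endpoint, two rows per period.
long <- rbind(
  data.frame(dfr[c("name", "flag", "row.id")], value = dfr$start.date),
  data.frame(dfr[c("name", "flag", "row.id")], value = dfr$end.date)
)

# Approximation of the default grouping: one group per name/flag combination.
default.group <- interaction(long$name, long$flag, drop = TRUE)
points.per.default.group <- table(default.group)

# Explicit grouping by row.id: one group per original period.
points.per.row.id <- table(long$row.id)

print(points.per.default.group)
print(points.per.row.id)
```

The two unflagged "Write introduction" periods share one default group with four points, so they are drawn as one continuous line that also spans the flagged period between them. Grouping by `row.id` leaves exactly two points per group, which is one segment per period.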

Upvotes: 2

user5813583

Reputation: 133

I solved it by skipping melt and using geom_linerange instead of geom_line:

ggplot(dfr) +
  geom_linerange(aes(y = name, 
                     xmin = start.date,
                     xmax = end.date,
                     colour = as.factor(flag)),
                 size = 5) +
  theme_minimal()

Still, it would be good to know why it didn't work with geom_line. If anyone could help me with that, I'd appreciate it very much!

Upvotes: 2
