SalicBlu3

Reputation: 1904

ggplot2 - Draw Line Graph on Categorical Variable

Context: I want to represent the journey through a process.

Sample Data:

+------------------+------------------------+------------------------+
|       Time       |   ProcessNo (factor)   |  ProcessName (factor)  |
+------------------+------------------------+------------------------+
|    2014-08-01    |           1            |      Brainstorming     |
|    2014-08-03    |           2            |      Estimation        |
|    2014-08-04    |           1            |      Brainstorming     |
|    2014-08-09    |           3            |      Construction      |
|    2014-08-14    |           4            |      Rectifying        |
+------------------+------------------------+------------------------+

I've drawn a plot using ggplot2 with x = Time, y = ProcessName.

p <- ggplot(dfCheckpoints, aes(Time, ProcessName))
p <- p + geom_point()

Problem: I'd like to overlay a line joining the points in chronological order. If it helps, ProcessNo is simply the factor level of the ProcessName variable.

I've tried adding a geom_line layer:

p <- p + geom_line(data = dfCheckpoints, aes(x = Time, y = ProcessNo))

But it adds extra levels (the ProcessNo values) to the y-axis.

If there is another way, I'm happy to try it too.

Thanks in advance!

Upvotes: 1

Views: 2432

Answers (1)

r.bot

Reputation: 5424

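Looking at the online documentation for geom_line, I think you need a grouping variable in there. geom_line draws one line per group, and with a discrete variable on the y-axis each level becomes its own group by default, so the points are never joined across processes. A constant grouping variable puts every point in a single group. This gives what I think you have asked for.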

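library(ggplot2)
library(lubridate)

# Time is stored as a factor, so convert it to text before parsing it as a date
dfCheckpoints$Time <- ymd(as.character(dfCheckpoints$Time))
# Plot ProcessName as plain text (the y-axis is then ordered alphabetically)
dfCheckpoints$ProcessName <- as.character(dfCheckpoints$ProcessName)
# Constant grouping variable: all points belong to one line
dfCheckpoints$group <- 1

p <- ggplot(dfCheckpoints, aes(Time, ProcessName, group = group)) +
  geom_point() +
  geom_line()
p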
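
As a variation on the code above, here is a minimal sketch that sets the group as a constant inside aes() instead of adding a column, and keeps ProcessName as a factor so the y-axis follows the process order rather than the alphabet. The data frame is re-typed from the table in the question, and the level order (matching ProcessNo) is my assumption about what is wanted.

```r
library(ggplot2)

# Sample data re-typed from the question's table (an assumption about its exact form)
dfCheckpoints <- data.frame(
  Time = as.Date(c("2014-08-01", "2014-08-03", "2014-08-04",
                   "2014-08-09", "2014-08-14")),
  ProcessName = factor(
    c("Brainstorming", "Estimation", "Brainstorming",
      "Construction", "Rectifying"),
    # Assumed level order: follows ProcessNo 1 to 4 from the question
    levels = c("Brainstorming", "Estimation", "Construction", "Rectifying")
  )
)

# group = 1 puts every point in one group, so geom_line joins them all,
# connecting the points in order of Time
p <- ggplot(dfCheckpoints, aes(Time, ProcessName, group = 1)) +
  geom_point() +
  geom_line()
p
```

Because both layers share the same y mapping, no extra levels should appear on the y-axis.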

And for anyone else trying this, here's the dput() of my interpretation of the data:

structure(list(Time = structure(1:5, .Label = c("2014-08-01", 
"2014-08-03", "2014-08-04", "2014-08-09", "2014-08-14"), class = "factor"), 
    ProcessNo = structure(c(1L, 2L, 1L, 3L, 4L), .Label = c("1", 
    "2", "3", "4"), class = "factor"), ProcessName = structure(c(1L, 
    3L, 1L, 2L, 4L), .Label = c("Brainstorming", "Construction", 
    "Estimation", "Rectifying"), class = "factor")), .Names = c("Time", 
"ProcessNo", "ProcessName"), row.names = c(NA, -5L), class = "data.frame")

Upvotes: 1
