greenhorntechie

Reputation: 386

Facing an issue with ggplot

I have a very simple data frame, shown below.

  cat_group     total    abort_rate          cancel_rate  success_rate
      100       1804       18.8                45.1         31.8
      200       4118       17.7                30.0         48.3
      500      14041       19.2                16.9         60.0

I am trying to plot this data with cat_group on the x-axis and a line for each of the other variables: total, abort_rate, cancel_rate and success_rate. The idea is to show how each of these variables varies with the value of cat_group. I need four lines in total, one per variable, each in a different colour.

But when I use the plot code below in R, I get the error: geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?

ggplot(my_data_frame, aes(cat_group)) + 
  geom_line(aes(y = abort_rate, colour = "abort_rate")) + 
  geom_line(aes(y = success_rate, colour = "success_rate")) +
  geom_line(aes(y = total, colour = "total")) +
  geom_line(aes(y = cancel_rate, colour = "cancel_rate"))

Any suggestions on how to resolve this issue?

Upvotes: 1

Views: 87

Answers (3)

G. Grothendieck

Reputation: 269566

One easy way to do this is to use autoplot.zoo:

library(ggplot2)
library(zoo)

z <- read.zoo(my_df)
autoplot(z, facets = NULL) + scale_y_log10()

or for separate panels without a log scale:

autoplot(z) + facet_free()

Note: Here is the input data in reproducible form:

Lines <- "cat_group     total    abort_rate          cancel_rate  success_rate
      100       1804       18.8                45.1         31.8
      200       4118       17.7                30.0         48.3
      500      14041       19.2                16.9         60.0"
my_df <- read.table(text = Lines, header = TRUE)

Upvotes: 2

LyzandeR

Reputation: 37879

Assuming that cat_group is of factor type (that's the only way I can reproduce your error) you could do it like this:

my_data_frame$cat_group <- as.factor(my_data_frame$cat_group)

library(ggplot2)
ggplot(my_data_frame, aes(cat_group)) + 
  geom_line(aes(y = abort_rate, colour = "abort_rate", group = 1)) + 
  geom_line(aes(y = success_rate, colour = "success_rate", group = 1)) +
  geom_line(aes(y = total, colour = "total", group = 1)) +
  geom_line(aes(y = cancel_rate, colour = "cancel_rate", group = 1))

i.e. by specifying one group per geom_line. The drawback is that all four lines share a single y-axis, and total is orders of magnitude larger than the rates, so the three rate lines are squashed together near the bottom of the plot.
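
If cat_group is meant to be a number rather than a category, an alternative to group = 1 is to convert it back to numeric so that the x-axis is continuous. The following is a minimal sketch under that assumption; the data frame is rebuilt from the values in the question, and total is left out only to keep the rates readable.

```r
library(ggplot2)

# Data from the question, with cat_group stored as a factor
# (assumed here to be the cause of the error)
my_data_frame <- data.frame(
  cat_group    = factor(c(100, 200, 500)),
  total        = c(1804, 4118, 14041),
  abort_rate   = c(18.8, 17.7, 19.2),
  cancel_rate  = c(45.1, 30.0, 16.9),
  success_rate = c(31.8, 48.3, 60.0)
)

# Go through as.character() to recover the labels (100, 200, 500)
# instead of the internal level codes (1, 2, 3)
my_data_frame$cat_group <- as.numeric(as.character(my_data_frame$cat_group))

# With a continuous x, each geom_line() joins its points without group = 1
p <- ggplot(my_data_frame, aes(cat_group)) +
  geom_line(aes(y = abort_rate, colour = "abort_rate")) +
  geom_line(aes(y = cancel_rate, colour = "cancel_rate")) +
  geom_line(aes(y = success_rate, colour = "success_rate"))
p
```

Note that the points are then spaced by their numeric value, so 500 sits further from 200 than 200 does from 100.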

The typical way of working with such data is to melt the data.frame and then plot it like this:

library(reshape2)
dfm <- melt(my_data_frame, id.vars='cat_group')
ggplot(dfm, aes(x=cat_group, y=value, colour=variable, group=variable)) + geom_line() +
  scale_y_log10()

Note the scale_y_log10(), which is what makes all 4 lines visible. Without a log scale, total is so much larger than the rates that the other three lines are flattened against the bottom of the plot and overlap.
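
If a log scale is not wanted, another option is to give each variable its own panel with an independent y-axis. This is only a sketch, assuming cat_group is numeric and using the values from the question; facet_wrap() with scales = "free_y" is a standard ggplot2 call.

```r
library(ggplot2)
library(reshape2)

# Data from the question, with a numeric cat_group (an assumption)
my_data_frame <- data.frame(
  cat_group    = c(100, 200, 500),
  total        = c(1804, 4118, 14041),
  abort_rate   = c(18.8, 17.7, 19.2),
  cancel_rate  = c(45.1, 30.0, 16.9),
  success_rate = c(31.8, 48.3, 60.0)
)

# Long format: one row per (cat_group, variable) pair
dfm <- melt(my_data_frame, id.vars = "cat_group")

# One panel per variable, each with its own y-axis range
p <- ggplot(dfm, aes(x = cat_group, y = value, colour = variable)) +
  geom_line() +
  facet_wrap(~ variable, scales = "free_y")
p
```

The trade-off is that values can no longer be compared across panels by eye, since each panel has a different axis.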

Upvotes: 2

nist

Reputation: 1721

The best way to solve this is to reshape your data so that you have one column for the x-axis, one for the y-axis, and one for the type of value contained in each row. You can use the tidyr package to do this.

library(tidyr)
library(ggplot2)

# One row per (cat_group, type) pair
plottingData <- my_data_frame %>% gather(type, value, -cat_group)

ggplot(plottingData, aes(x = cat_group, y = value, color = type)) + geom_line()
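
In current versions of tidyr, gather() is superseded by pivot_longer(), so the same reshape can be sketched as below. The data frame is rebuilt from the question's values with a numeric cat_group (an assumption), and plotting_data is just an illustrative name.

```r
library(tidyr)
library(ggplot2)

# Data from the question, with a numeric cat_group (an assumption)
my_data_frame <- data.frame(
  cat_group    = c(100, 200, 500),
  total        = c(1804, 4118, 14041),
  abort_rate   = c(18.8, 17.7, 19.2),
  cancel_rate  = c(45.1, 30.0, 16.9),
  success_rate = c(31.8, 48.3, 60.0)
)

# Every column except cat_group becomes a (type, value) pair
plotting_data <- pivot_longer(
  my_data_frame,
  cols      = -cat_group,
  names_to  = "type",
  values_to = "value"
)

# group = type keeps one line per variable even if cat_group is a factor
p <- ggplot(plotting_data,
            aes(x = cat_group, y = value, colour = type, group = type)) +
  geom_line()
p
```

Adding scale_y_log10() or faceting by type may still be needed, since total is far larger than the rates.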

Upvotes: 0