Reputation: 386
I have a very simple data frame, shown below.
cat_group total abort_rate cancel_rate success_rate
100 1804 18.8 45.1 31.8
200 4118 17.7 30.0 48.3
500 14041 19.2 16.9 60.0
I am trying to plot this data with cat_group on the x-axis and a line for each of the other variables: total, abort_rate, cancel_rate and success_rate. The idea is to show how each variable varies with the value of cat_group. I need four lines in total, one per variable, each in a different colour.
But when I use the plot code below in R, I get the error: geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?
ggplot(my_data_frame, aes(cat_group)) +
  geom_line(aes(y = abort_rate, colour = "abort_rate")) +
  geom_line(aes(y = success_rate, colour = "success_rate")) +
  geom_line(aes(y = total, colour = "total")) +
  geom_line(aes(y = cancel_rate, colour = "cancel_rate"))
Any suggestions on how to resolve this issue?
Upvotes: 1
Views: 87
Reputation: 269566
One easy way to do this is to use autoplot.zoo:
library(ggplot2)
library(zoo)
z <- read.zoo(my_df)
autoplot(z, facet = NULL) + scale_y_log10()
Or, for separate panels without a log scale:
autoplot(z) + facet_free()
Note: Here is the input data in reproducible form:
Lines <- "cat_group total abort_rate cancel_rate success_rate
100 1804 18.8 45.1 31.8
200 4118 17.7 30.0 48.3
500 14041 19.2 16.9 60.0"
my_df <- read.table(text = Lines, header = TRUE)
Upvotes: 2
Reputation: 37879
Assuming that cat_group is of factor type (that's the only way I can reproduce your error), you could do it like this:
my_data_frame$cat_group <- as.factor(my_data_frame$cat_group)
library(ggplot2)
ggplot(my_data_frame, aes(cat_group)) +
  geom_line(aes(y = abort_rate, colour = "abort_rate", group = 1)) +
  geom_line(aes(y = success_rate, colour = "success_rate", group = 1)) +
  geom_line(aes(y = total, colour = "total", group = 1)) +
  geom_line(aes(y = cancel_rate, colour = "cancel_rate", group = 1))
i.e. by specifying group = 1 in each geom_line, so that ggplot connects the points across the factor levels instead of treating each level as its own group. This still has a problem: all four lines share one linear y-axis, and total is in the thousands while the rates stay below 100, so the three rate lines end up squashed together near the bottom of the plot.
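If cat_group does not need to be a factor, another option is to convert it back to a number, in which case no group aesthetic is needed at all. The following is only a sketch: the column name cat_num is made up for illustration, the data frame is rebuilt from the values in the question, and total is left out so the three rates stay readable on a linear axis.

```r
library(ggplot2)

# Same data as in the question, with cat_group stored as a factor
my_data_frame <- data.frame(
  cat_group    = factor(c(100, 200, 500)),
  total        = c(1804, 4118, 14041),
  abort_rate   = c(18.8, 17.7, 19.2),
  cancel_rate  = c(45.1, 30.0, 16.9),
  success_rate = c(31.8, 48.3, 60.0)
)

# Go through as.character() first: as.numeric() applied directly to a
# factor returns the internal level codes (1, 2, 3), not 100, 200, 500.
# cat_num is a hypothetical helper column, not part of the original data.
my_data_frame$cat_num <- as.numeric(as.character(my_data_frame$cat_group))

# With a continuous x, each layer forms a single group by default,
# so geom_line() connects the three points without group = 1
p <- ggplot(my_data_frame, aes(cat_num)) +
  geom_line(aes(y = abort_rate, colour = "abort_rate")) +
  geom_line(aes(y = cancel_rate, colour = "cancel_rate")) +
  geom_line(aes(y = success_rate, colour = "success_rate")) +
  labs(x = "cat_group", y = "rate", colour = NULL)
print(p)
```

Note that a numeric x-axis spaces 100, 200 and 500 proportionally, whereas a factor spaces them evenly.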
The typical way of working with such data is to melt the data.frame and then plot it like this:
library(reshape2)
dfm <- melt(my_data_frame, id.vars='cat_group')
ggplot(dfm, aes(x=cat_group, y=value, colour=variable, group=variable)) + geom_line() +
scale_y_log10()
Notice the scale_y_log10 in order to plot (and actually see) all 4 lines. You probably need a log scale; otherwise total, which is far larger than the rates, dominates the y-axis and the other lines are squashed together near the bottom.
Upvotes: 2
Reputation: 1721
The best way to solve this is to reshape your data so that you have one column for the x-axis, one for the y-axis, and one for the type of data contained in the row. To do this you can use the tidyr package.
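library(tidyr)
library(ggplot2)
# Reshape to long format: one row per (cat_group, type) pair
plottingData <- my_data_frame %>% gather(type, value, -cat_group)
ggplot(plottingData, aes(x = cat_group, y = value, color = type)) + geom_line()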
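In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(), so the same reshape can be sketched as below. This assumes cat_group is numeric, as in the question's table; the name long is chosen here for illustration, and the log scale is borrowed from the other answers so that total does not flatten the rates.

```r
library(tidyr)
library(ggplot2)

# Data from the question, with cat_group kept numeric
my_data_frame <- data.frame(
  cat_group    = c(100, 200, 500),
  total        = c(1804, 4118, 14041),
  abort_rate   = c(18.8, 17.7, 19.2),
  cancel_rate  = c(45.1, 30.0, 16.9),
  success_rate = c(31.8, 48.3, 60.0)
)

# Long format: every column except cat_group becomes a (type, value) pair
long <- pivot_longer(
  my_data_frame,
  cols      = -cat_group,
  names_to  = "type",
  values_to = "value"
)

# One line per type; the log y-axis keeps total and the rates all visible
p <- ggplot(long, aes(x = cat_group, y = value, colour = type)) +
  geom_line() +
  scale_y_log10()
print(p)
```

If cat_group were a factor instead, adding group = type inside aes() should be enough to avoid the geom_path error from the question.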
Upvotes: 0