John legend2

Reputation: 920

Smoothing line with a categorical variable in ggplot2?

I have a huge data set, and this is a sample of it.

dat <- data.frame(basket_size_group = c("[0,2]", "[0,2]", "(2,4]", "(2,4]", "(4,6]"),
                  channel = c("offline", "online/mobile", "offline", "online/mobile", "offline"),
                  pct_trips = c(0.004, 0.038, 0.0028, 0.0082, 0.0037))

Using ggplot2, I would like to plot a smoothing line through this data. The x-axis is basket_size_group, the y-axis is pct_trips, and channel is the grouping variable. The problem is that basket_size_group is a categorical variable. How can I create smoothing lines by channel with ggplot2?

Upvotes: 3

Views: 4486

Answers (1)

Nate

Reputation: 10671

If you want to use a loess smooth, you will need more data than this sample provides. As it stands, stat_smooth() with its default method will fail with the error:

Computation failed in `stat_smooth()`:
NA/NaN/Inf in foreign function call (arg 5)

The error goes away if you specify method = "lm".

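You also have to be explicit in the stat_smooth() layer and set group = channel. You could set it in the top-level ggplot() call instead. Without it, ggplot2 groups by the interaction of the discrete variables in the plot (here x and color), so in this sample every group holds a single point and there is nothing to fit a line through.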

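library(ggplot2)

# dat is the data frame from the question
# convert to a factor with explicit levels so the groups plot in order
dat$basket_size_group <- factor(dat$basket_size_group, levels = c("[0,2]", "(2,4]", "(4,6]"))

# group = channel makes stat_smooth() fit one linear model per channel
ggplot(dat, aes(basket_size_group, pct_trips, color = channel)) +
    geom_point() +
    stat_smooth(aes(group = channel), method = "lm")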
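For comparison, here is a minimal sketch of the "top layer" variant mentioned above, with group = channel set once in ggplot() so every layer inherits it. The data frame is rebuilt from the question's sample so the snippet runs on its own. The name p and the use of se = FALSE are my own choices, not part of the original answer. I dropped the confidence band because one channel has only two points in this sample, which leaves no residual degrees of freedom for a standard error.

```r
library(ggplot2)

# sample data from the question, with the x variable as an ordered factor
dat <- data.frame(
    basket_size_group = factor(c("[0,2]", "[0,2]", "(2,4]", "(2,4]", "(4,6]"),
                               levels = c("[0,2]", "(2,4]", "(4,6]")),
    channel = c("offline", "online/mobile", "offline", "online/mobile", "offline"),
    pct_trips = c(0.004, 0.038, 0.0028, 0.0082, 0.0037)
)

# group = channel in the top-level aes() is inherited by both layers
# se = FALSE is an assumption of this sketch (see the note above)
p <- ggplot(dat, aes(basket_size_group, pct_trips,
                     color = channel, group = channel)) +
    geom_point() +
    stat_smooth(method = "lm", se = FALSE)

print(p)
```

Either placement should give one fitted line per channel. Keeping group inside stat_smooth() limits the override to that layer, which can matter once other layers need the default grouping.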


Upvotes: 3
