karips

Reputation: 151

Strings in ggplot x-axis

I'm trying to create a graph in R like the one in the example image (not included here).

I have three columns (online, offline and routes). However, when I add the following code:

library(ggplot2)
ggplot(coefroute, aes(routes,offline)) + geom_line()

I get the following message:

geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?

sample of coefroute:

routes       online    offline
(Intercept)  210.4372  257.215
route10      7.543     30.0182
route100     18.3794   1.5313
route11      38.6537   78.8655
route12      66.501    94.8838
route13      -22.2391  -25.8448
route14      24.3652   177.7728
route15      48.5464   51.126
...

routes: char, online and offline: num

Can anybody help me put strings on the x-axis in R? Thank you!

Upvotes: 1

Views: 2132

Answers (2)

Uwe

Reputation: 42592

There are two approaches:

  • Plotting the data in wide format (quick & dirty, not recommended)
  • Plotting the data after reshaping from wide to long format (as shown by dshkol, but using a different approach)

Plotting the data in wide format

# using dshkol's toy data
coefroute <- data.frame(routes = c("A","B","C","D","E"),
                        online = c(21,26,30,15,20),
                        offline = c(15,20,7,12,15))
library(ggplot2)
# plotting data in wide format (not recommended)
ggplot(coefroute, aes(x = routes, group = 1L)) + 
  geom_line(aes(y = online), colour = "blue") +
  geom_line(aes(y = offline), colour = "orange")

This approach has several drawbacks. Each variable needs its own call to geom_line() and there is no legend.

Plotting reshaped data

For reshaping, melt() is used. It is available from the reshape2 package (the predecessor of tidyr) or, in a faster implementation, from the data.table package.

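# convert to a data.table first: newer data.table versions may warn or fail
# when melt() is given a plain data.frame
long <- data.table::melt(data.table::as.data.table(coefroute), id.vars = "routes")
ggplot(long, aes(x = routes, y = value, group = variable, colour = variable)) + 
  geom_line()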

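Note that in both cases the group aesthetic has to be specified because the x-axis is discrete. By default, ggplot puts each discrete x value into its own group, which is what triggers the "Each group consists of only one observation" message. Setting group tells ggplot which data points belong to one series despite the discrete x values.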
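Applied to the single-series call from the question, this could look like the following minimal sketch. The toy values are made up for illustration; only the column names routes and offline come from the question.

```r
library(ggplot2)

# Toy data with the same column types as in the question
# (the values are made up for illustration)
coefroute <- data.frame(routes  = c("A", "B", "C", "D", "E"),
                        offline = c(15, 20, 7, 12, 15))

# group = 1 puts every row into the same group, so geom_line()
# connects the points across the discrete x values
p <- ggplot(coefroute, aes(x = routes, y = offline, group = 1)) +
  geom_line()
p
```

Without group = 1, the same call reproduces the geom_path message quoted in the question.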

Upvotes: 1

dshkol

Reputation: 1228

In the absence of sample data, here's some toy data that has the same structure as yours:

coefroute <- data.frame(routes = c("A","B","C","D","E"),
                    online = c(21,26,30,15,20),
                    offline = c(15,20,7,12,15))

To replicate your example graph in ggplot2 you would want your data in a long format, so that you can group on offline/online. See more here: Plotting multiple lines from a data frame with ggplot2 and http://ggplot2.tidyverse.org/reference/aes.html.

Many functions and packages can rearrange your data into a long format, but a standard approach is to use gather from tidyr, which collects your online and offline series into a single key column called, say, status (or whatever you want).

library(tidyr)
coefroute <- gather(coefroute, key = status, value = coef, online:offline)
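As an alternative, tidyr (version 1.0.0 and later) provides pivot_longer(), which supersedes gather(). A rough equivalent of the call above might look like this; the names wide and long are placeholders chosen here so that the original data frame is not overwritten.

```r
library(tidyr)

# Same toy data as above, kept under a separate (hypothetical) name
wide <- data.frame(routes  = c("A", "B", "C", "D", "E"),
                   online  = c(21, 26, 30, 15, 20),
                   offline = c(15, 20, 7, 12, 15))

# Stack the online and offline columns: the column names go into
# "status" and the numbers go into "coef"
long <- pivot_longer(wide,
                     cols      = c(online, offline),
                     names_to  = "status",
                     values_to = "coef")
```

The result has one row per route and status, so it can be passed to the ggplot calls in this answer in place of the gathered coefroute.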

Then you can plot this easily in ggplot:

library(ggplot2)
# the + must end the line; a line that starts with + is parsed as a new expression
ggplot(coefroute, aes(x = routes, y = coef, group = status, colour = status)) +
  geom_line() + scale_x_discrete()

That should create something like your example graph. You may want to modify the colours, captions, etc. There's lots of documentation about these things that's easy enough to find. I've added scale_x_discrete() here to make explicit that x is a discrete variable; ggplot2 already treats a character column as discrete, so it is optional.

As a side note, my suspicion is that a line plot may be less effective than other geoms in communicating what you're trying to communicate here, because the routes are unordered categories. I would perhaps use geom_bar(stat = "identity", position = "dodge") in place of geom_line. That would create a vertical bar chart with the offline and online coefficients side by side for each route.

# the + must end the line; a line that starts with + is parsed as a new expression
ggplot(coefroute, aes(x = routes, y = coef, group = status, fill = status)) +
  geom_bar(stat = "identity", position = "dodge") + scale_x_discrete()
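In current ggplot2 versions, geom_col() is a shorthand for geom_bar(stat = "identity"), so the same chart could be sketched as below. The data frame long is a hypothetical stand-in for the reshaped toy data, built by hand so the snippet runs on its own.

```r
library(ggplot2)

# Toy data already in long format (made-up values, as in the toy data above)
long <- data.frame(
  routes = rep(c("A", "B", "C", "D", "E"), times = 2),
  status = rep(c("online", "offline"), each = 5),
  coef   = c(21, 26, 30, 15, 20, 15, 20, 7, 12, 15)
)

# geom_col() uses the y values as bar heights, like geom_bar(stat = "identity");
# position = "dodge" places the bars for each status side by side
p <- ggplot(long, aes(x = routes, y = coef, fill = status)) +
  geom_col(position = "dodge")
p
```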

Upvotes: 1
