spindoctor

Reputation: 1897

Ordering alphanumeric variables for plotting

How do I order a set of variable names along the x-axis when the names contain both letters and numbers? They come from a survey where the variables are named like var1 below. When plotted, they appear in the order out_1, out_10, out_11, ...

What I would like is for them to be plotted in the order out_1, out_2, ...

library(tidyverse)
var1<-rep(paste0('out','_', seq(1,12,1)), 100)
var2<-rnorm(n=length(var1) ,mean=2)
df<-data.frame(var1, var2)
ggplot(df, aes(x=var1, y=var2))+geom_boxplot()

I tried this:

df %>% 
separate(var1, into=c('A', 'B'), sep='_') %>% 
arrange(B) %>%  
ggplot(., aes(x=B, y=var2))+geom_boxplot()

Upvotes: 2

Views: 776

Answers (1)

Carlos Eduardo Lagosta

Reputation: 1001

You can set the order of the levels of var1 before plotting by converting it to a factor with explicit levels:

df$var1 <- factor(df$var1, levels = unique(df$var1))
ggplot(df, aes(var1,var2)) + geom_boxplot()
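
One thing worth checking at this step: assigning to levels() renames the existing levels in place, while factor(..., levels = ) changes the level order and leaves the data alone. A small base-R sketch with made-up values (x, f, renamed and reordered are illustrative names, not part of the question's data):

```r
x <- c("out_1", "out_2", "out_10")

# Default factor: levels are sorted as text, so "out_10" comes before "out_2"
f <- factor(x)
levels(f)                 # "out_1" "out_10" "out_2"

# Assigning to levels() only relabels: the values themselves get swapped
renamed <- f
levels(renamed) <- unique(x)
as.character(renamed)     # "out_1" "out_10" "out_2"  (no longer matches x)

# factor(..., levels = ) reorders the levels and keeps every value intact
reordered <- factor(x, levels = unique(x))
as.character(reordered)   # "out_1" "out_2" "out_10"
levels(reordered)         # "out_1" "out_2" "out_10"
```

Only the second form keeps each box attached to its own data.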

Or you can specify the order in ggplot scale options:

ggplot(df, aes(var1,var2)) +
  geom_boxplot() +
  scale_x_discrete(limits = unique(as.character(df$var1)))  # limits sets the order; labels would only rename the ticks

Both cases give the same plot, with the boxes in the requested order.
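
Using unique() for the order relies on the values first appearing in numeric order in the data, which happens to hold for the example. If that is not guaranteed, the order can be computed from the numeric suffix instead. A minimal base-R sketch, where natural_levels is a hypothetical helper and the assumption is that every name ends in an underscore followed by a number:

```r
# Hypothetical helper (not from the original answer): sort labels such as
# "out_10" by the number that follows the last underscore.
natural_levels <- function(x) {
  lv <- unique(as.character(x))
  lv[order(as.numeric(sub("^.*_", "", lv)))]
}

shuffled <- c("out_10", "out_2", "out_1", "out_12", "out_2")
natural_levels(shuffled)  # "out_1" "out_2" "out_10" "out_12"

# The result can be used as factor levels (or as scale limits)
f <- factor(shuffled, levels = natural_levels(shuffled))
levels(f)                 # "out_1" "out_2" "out_10" "out_12"
```

Since the tidyverse is already loaded in the question, stringr::str_sort(x, numeric = TRUE) should give a similar ordering without a custom helper.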

You can also use the scale to set custom labels; there's no need to create a new variable:

ggplot(df, aes(var1, var2)) +
  geom_boxplot() +
  scale_x_discrete('output',
                   limits = unique(as.character(df$var1)),
                   labels = gsub('out_', '', unique(as.character(df$var1))))


Check ?scale_x_discrete (or ?discrete_scale) for details. You can combine limits, breaks and labels in different ways, including labels that come from outside your data.frame:

pers.labels <- paste('Output', 1:12)

ggplot(df, aes(var1, var2)) +
  geom_boxplot() +
  scale_x_discrete(NULL, limits = unique(as.character(df$var1)), labels = pers.labels)

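
A positional label vector has to line up with the order of the axis values, which is easy to get wrong. One way to avoid that is a named character vector, which ggplot2 matches to the values by name. A hedged sketch: lv and the rebuilt df are illustrative stand-ins for the question's data, and the plotting part only runs if ggplot2 is installed:

```r
# Names are the values found in the data; elements are the text to display
lv <- paste0("out_", 1:12)
pers.labels <- setNames(paste("Output", 1:12), lv)
pers.labels[["out_10"]]   # "Output 10"

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Stand-in data shaped like the question's example
  df <- data.frame(var1 = rep(lv, 100), var2 = rnorm(1200, mean = 2))
  p <- ggplot(df, aes(var1, var2)) +
    geom_boxplot() +
    scale_x_discrete(NULL, limits = lv, labels = pers.labels)
  # print(p) draws the plot
}
```

Here limits controls the order of the boxes and the names in pers.labels decide which label goes with which box.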

Upvotes: 2
