Reputation: 132
I am currently trying to make a clean geom_col plot with a large number of samples. The sample names (which should be on the x-axis) are a mix of numbers and characters, since I include "N" for the negative control.
sample_names <- c(100,22,4,5,6,"N")
size <- c(3,2,3,4,2,3)
Now I would like the bars in ascending order of sample name (sample 4, then 5, 6, 22 and 100), ending with the N. Since the values in the column are stored as characters, the sort always starts with sample 100 (because "1-0-0" sorts before "2-2").
library(dplyr)
d <- data.frame(sample_names, size) %>%
  arrange(sample_names)
As a result, the bars in the plot are not in a sensible order.
It would be nicer to have them in ascending order with the N at the end.
I already tried converting this column to numeric and replacing the resulting NA (which appears in place of the "N") with 0.
The problem is that the plot then has huge gaps between the samples:
library(dplyr)
library(tidyr)  # for replace_na()
d <- data.frame(sample_names, size) %>%
  arrange(sample_names) %>%
  mutate(sample_names = as.numeric(sample_names)) %>%  # "N" becomes NA (with a warning)
  replace_na(list(sample_names = 0))
So my question is: do you know how to either sort a character column in the "correct" ascending order, OR how to close the gaps on the x-axis in ggplot2? Thank you.
Upvotes: 1
Views: 378
Reputation: 389135
The order of the bars is controlled by the factor levels in the data. To automate generating the levels, you can use a regex to pick out the values that consist only of digits, convert them to numeric, sort them, and append the non-numeric values at the end.
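# positions of the labels that consist only of digits
num <- grep('^\\d+$', d$sample_names)
# levels: numeric labels sorted by value, then the remaining labels (here "N")
d$sample_names <- factor(d$sample_names,
                         c(sort(unique(as.numeric(d$sample_names[num]))),
                           unique(d$sample_names[-num])))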
library(ggplot2)
ggplot(d, aes(sample_names, size)) + geom_col()
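For reference, here is the same idea as one self-contained sketch using the toy data from the question. The `lvls` variable and the explicit `stringsAsFactors = FALSE` are my additions for illustration; the plotting step is guarded so the snippet still runs if ggplot2 is not installed.

```r
# Toy data from the question; c() coerces all values to character
sample_names <- c(100, 22, 4, 5, 6, "N")
size <- c(3, 2, 3, 4, 2, 3)
d <- data.frame(sample_names, size, stringsAsFactors = FALSE)

# Positions of the labels that consist only of digits
num <- grep("^\\d+$", d$sample_names)

# Numeric labels sorted by value, followed by the non-numeric ones
lvls <- c(sort(unique(as.numeric(d$sample_names[num]))),
          unique(d$sample_names[-num]))
d$sample_names <- factor(d$sample_names, levels = lvls)

levels(d$sample_names)
# "4" "5" "6" "22" "100" "N"

# A factor is drawn on a discrete x-axis, so the bars are evenly spaced
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(d, aes(sample_names, size)) + geom_col()
  print(p)
}
```

Because the x-axis is discrete once the column is a factor, this should also avoid the gaps you get from plotting the numeric values directly.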
A simpler approach, as suggested by @Rui Barradas, is to use stringr::str_sort or gtools::mixedsort:
d$sample_names <- factor(d$sample_names, stringr::str_sort(unique(d$sample_names), numeric = TRUE))
d$sample_names <- factor(d$sample_names, gtools::mixedsort(unique(d$sample_names)))
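If you would rather not add a package, a base R sketch along the same lines is to order the unique labels by their numeric value and rely on `order()` placing `NA` last by default. This is an alternative I am suggesting for illustration, not what the two functions above do internally, and the names `x`, `u`, `n` and `lvls` are just placeholders.

```r
# Sample labels as character, including a repeated one
x <- c("100", "22", "4", "5", "6", "N", "4")
u <- unique(x)

# Non-numeric labels become NA; the coercion warning is expected, so silence it
n <- suppressWarnings(as.numeric(u))

# order() sorts by numeric value and puts NA (here "N") at the end
lvls <- u[order(n)]

f <- factor(x, levels = lvls)
levels(f)
# "4" "5" "6" "22" "100" "N"
```

In the question's data frame this would be `d$sample_names <- factor(d$sample_names, levels = lvls)` with `lvls` computed from `unique(d$sample_names)`.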
Upvotes: 2