SriniShine

Reputation: 1139

R: Sorting a data frame by an alphanumeric

I have a data frame that stores a count for each model. The model names are alphanumeric strings. I generate a bar plot with ggplot2, with the models on the x axis and the counts on the y axis, and I want to control the order of the x axis. Currently the models appear in the order shown below, both in the data frame and on the x axis of the plot. I want them in natural numeric order instead, for example M_1, M_2, M_3, M_10, M_11, M_20, etc.

Model   Count
M_1     73
M_10    71
M_100   65
M_11    65
M_110   64
M_111   71
M_13    70
M_130   73
M_2     72
M_20    69
M_200   63
M_21    72
M_210   72
M_211   67
M_3     78
M_30    76
M_300   59
M_31    73
M_310   64

I tried order(), mixedsort(), and arrange() to sort the data frame first, and factor() in ggplot2, but none of them worked.

geneDFColSum[with(geneDFColSum, order(geneDFColSum$Model)), ]

geneDFColSum[with(geneDFColSum, mixedsort(geneDFColSum$Model)), ]

library(dplyr)
  arrange(geneDFColSum, Model)

Is there a way to achieve this? I could split the model number into a separate column and order by that column, but I am looking for an easier way.

Upvotes: 5

Views: 2616

Answers (2)

neilfws

Reputation: 33782

Here's a solution based on your idea "separate the model number into a separate column and order by that column". You can then use that to reorder the factor levels.

library(tidyverse)

geneDFColSum %>% 
  mutate(Order = as.numeric(gsub("M_", "", Model))) %>% 
  arrange(Order) %>% 
  mutate(Model = factor(Model, levels = Model)) %>%
  ggplot(aes(Model, Count)) + 
    geom_col()
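
A variation on the same idea, sketched in base R and assuming every model name has the form M_<number>: reorder() sets the factor levels from a numeric key directly, so the helper column and the arrange() step are not strictly needed. The small geneDFColSum below is a made-up stand-in that reuses a few rows from the question.

```r
# Toy stand-in for geneDFColSum (a few rows from the question's table)
geneDFColSum <- data.frame(
  Model = c("M_1", "M_10", "M_100", "M_2", "M_20", "M_3"),
  Count = c(73, 71, 65, 72, 69, 78),
  stringsAsFactors = FALSE
)

# Numeric part of each name; assumes the "M_" prefix is always present
model_num <- as.numeric(sub("^M_", "", geneDFColSum$Model))

# reorder() returns a factor whose levels are sorted by the key
geneDFColSum$Model <- reorder(geneDFColSum$Model, model_num)

levels(geneDFColSum$Model)
```

The resulting data frame can be passed to the same ggplot(aes(Model, Count)) + geom_col() call as above.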


Upvotes: 2

Gregor Thomas

Reputation: 145815

You need to order the levels of the factor, not the rows of the data:

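library(ggplot2)
# mixedsort() puts "M_2" before "M_10"; unique() guards against repeated model names
dd$Model = factor(dd$Model, levels = gtools::mixedsort(unique(dd$Model)))
ggplot(dd, aes(x = Model, y = Count)) + geom_col()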



Using this as input data:

dd = read.table(text = "Model   Count
M_1 73
M_10    71
M_100   65
M_11    65
M_110   64
M_111   71
M_13    70
M_130   73
M_2 72
M_20    69
M_200   63
M_21    72
M_210   72
M_211   67
M_3 78
M_30    76
M_300   59
M_31    73
M_310   64", header = TRUE, stringsAsFactors = FALSE)
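
To illustrate why sorting the rows alone does not change the plot, here is a minimal base R sketch. It assumes names of the form M_<number>; the data frame toy and the other variable names are made up for the example. A character column is converted to a factor with its levels sorted as strings, regardless of row order, and ggplot2 orders a discrete axis in the same way unless the levels are set explicitly.

```r
# Hypothetical small example; values reuse a few rows from the question
toy <- data.frame(Model = c("M_1", "M_10", "M_2", "M_20", "M_3"),
                  Count = c(73, 71, 72, 69, 78),
                  stringsAsFactors = FALSE)

# Sort the rows by the numeric suffix
toy_sorted <- toy[order(as.numeric(sub("^M_", "", toy$Model))), ]

# Default factor levels are sorted as strings and ignore the row order
default_levels <- levels(factor(toy_sorted$Model))

# Passing the levels explicitly keeps the order of the sorted rows
fixed_levels <- levels(factor(toy_sorted$Model,
                              levels = unique(toy_sorted$Model)))

print(default_levels)
print(fixed_levels)
```

Only the second factor would give the M_1, M_2, M_3, M_10, M_20 axis order when plotted.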

Upvotes: 4
