Reputation: 33
I have the following:
a_aa a_ab a_ac b_aa b_ab b_ac
2 3 3 3 1 2
3 4 1 1 3 1
Desired outcome:
a_aa b_aa a_ab b_ab a_ac b_ac
2 3 3 1 3 2
3 1 4 3 1 1
Code with data:
d <- "a_aa a_ab a_ac b_aa b_ab b_ac
2 3 3 3 1 2
3 4 1 1 3 1"
dd <- read.table(textConnection(object = d), header = T)
My current solution is manual:
dd %>% select(a_aa, b_aa, a_ab, b_ab, a_ac, b_ac)
however, is onerous when number of columns is large. Any ideas how to do this kind of column ordering with grouping (e.g. sequence a_etc1, b_etc1, a_etc2, b_etc2)? Thank you!
Upvotes: 3
Views: 412
Reputation: 5788
Base R solution:
dd[,order(gsub(".*\\_(\\w+$)", "\\1", names(dd)))]
Upvotes: 1
Reputation: 78917
Here is an alternative way:
We first create a function using two additional packages:
stringi
for its str_reverse
function (advantage:it is vectorized)gtools
for its mixedsort
functionThen we apply it to our dataframe df
with rename_with
:
library(dplyr)
# the function
my_function <- function(df){
x <- stringi::stri_reverse(colnames(df))
y <- gtools::mixedsort(x)
stringi::stri_reverse(y)
}
df %>%
rename_with(., ~my_function(df))
a_aa b_aa a_ab b_ab a_ac b_ac
1 2 3 3 3 1 2
2 3 4 1 1 3 1
Upvotes: 1
Reputation: 6489
Here is one way to solve your problem:
dd[order(gsub(".+_", "", names(dd)))]
# or
dd %>%
select(order(gsub(".+_", "", names(.))))
a_aa b_aa a_ab b_ab a_ac b_ac
1 2 3 3 1 3 2
2 3 1 4 3 1 1
Upvotes: 6
Reputation: 26218
You may do something like this
library(tidyverse)
d <- "a_aa a_ab a_ac b_aa b_ab b_ac
2 3 1 3 3 2
3 1 3 4 1 1"
dd <- read.table(textConnection(object = d), header = T)
colnames(dd) %>%
str_split("_") %>%
map_chr(~.x[2]) %>%
unique() -> vars
dd %>%
select(ends_with(all_of(vars)))
#> a_aa b_aa a_ab b_ab a_ac b_ac
#> 1 2 3 3 3 1 2
#> 2 3 4 1 1 3 1
If you don't want to use other tidyverse libraries than dplyr, you can do
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
d <- "a_aa a_ab a_ac b_aa b_ab b_ac
2 3 1 3 3 2
3 1 3 4 1 1"
dd <- read.table(textConnection(object = d), header = T)
colnames(dd) %>%
strsplit("_") %>%
sapply(\(.x) .x[2]) %>%
unique() -> vars
dd %>%
select(ends_with(all_of(vars)))
#> a_aa b_aa a_ab b_ab a_ac b_ac
#> 1 2 3 3 3 1 2
#> 2 3 4 1 1 3 1
Created on 2022-07-10 by the reprex package (v2.0.1)
Upvotes: 1