Reputation: 335

reorder column names by last character

I have a data frame df with the following column names.

colnames(df) gives:

"SUBJID" "EoT_A"  "EoT_B"  "EoT_C"  "EoT_D"  "PR_A"   "PR_B"   "PR_C"   "PR_D"  
"PD_A"   "PD_B"   "PD_C"   "PD_D"   "CR_A"   "CR_B"   "CR_C"   "CR_D"

I would like to reorder the column names like this:

"SUBJID" 
"EoT_A" "PR_A" "PD_A" "CR_A"
"EoT_B" "PR_B" "PD_B" "CR_B"
"EoT_C" "PR_C" "PD_C" "CR_C"
"EoT_D" "PR_D" "PD_D" "CR_D"

Is there a smart way to achieve this?

Upvotes: 4

Answers (6)

Ronak Shah

Reputation: 389235

Similar to @Quinten's solution, but without offsetting the column indices: sub() is used inside dplyr::select(), with SUBJID listed first so that it stays in front.

df |> 
  dplyr::select(SUBJID, order(sub('.*_', '', names(df))))

# SUBJID EoT_A PR_A PD_A CR_A EoT_B PR_B PD_B CR_B EoT_C PR_C PD_C CR_C EoT_D PR_D PD_D CR_D
#1    NA    NA   NA   NA   NA    NA   NA   NA   NA    NA   NA   NA   NA    NA   NA   NA   NA

Upvotes: 0

Andre Wildberg

Reputation: 19191

An approach using matrix(). Filling a 4-row matrix with the column names by row (byrow) and reading it back column by column transposes the names in groups of 4.

df[, sapply(c(colnames(df)[1], 
       as.vector(matrix(colnames(df)[-1], nrow = 4, byrow = TRUE))), function(x) 
  which(colnames(df) == x))]
  SUBJID EoT_A PR_A PD_A CR_A EoT_B PR_B PD_B CR_B EoT_C PR_C PD_C CR_C EoT_D
1      1     2    6   10   14     3    7   11   15     4    8   12   16     5
2      2     3    7   11   15     4    8   12   16     5    9   13   17     6
3      3     4    8   12   16     5    9   13   17     6   10   14   18     7
  PR_D PD_D CR_D
1    9   13   17
2   10   14   18
3   11   15   19

Data

df <- structure(list(SUBJID = 1:3, EoT_A = 2:4, EoT_B = 3:5, EoT_C = 4:6, 
    EoT_D = 5:7, PR_A = 6:8, PR_B = 7:9, PR_C = 8:10, PR_D = 9:11, 
    PD_A = 10:12, PD_B = 11:13, PD_C = 12:14, PD_D = 13:15, CR_A = 14:16, 
    CR_B = 15:17, CR_C = 16:18, CR_D = 17:19), class = "data.frame", 
row.names = c(NA, -3L))
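
The matrix idea in this answer can also be sketched step by step. This is only an illustration under assumptions: the one-row data frame and the names cols, m, new_order and df2 are made up here, and the sketch indexes the data frame by name directly instead of looking up positions with which(). Like the original, it relies on there being exactly 4 columns per prefix, all in the same suffix order.

```r
# Column names mirroring the question
cols <- c("SUBJID",
          paste0("EoT_", LETTERS[1:4]), paste0("PR_", LETTERS[1:4]),
          paste0("PD_", LETTERS[1:4]), paste0("CR_", LETTERS[1:4]))

# Hypothetical one-row data frame; each value is the column's original position
df <- as.data.frame(matrix(seq_along(cols), nrow = 1,
                           dimnames = list(NULL, cols)))

# 4 x 4 matrix filled row by row: one row per prefix, one column per suffix
m <- matrix(cols[-1], nrow = 4, byrow = TRUE)

# as.vector() reads the matrix column by column, i.e. suffix by suffix
new_order <- c(cols[1], as.vector(m))

# A data frame can be indexed by a character vector of column names
df2 <- df[, new_order]
colnames(df2)
```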

Upvotes: 0

Quinten

Reputation: 41533

Another option uses sub() to extract the part after the last underscore (here, the last character) and sorts the columns alphabetically by it. To keep the first column out of the sort, drop it before ordering and add 1 to the resulting positions so that they index the full data frame again, like this:

df[c(1, 1+order(sub('.*_', '', colnames(df[,-1]))))]
#>   SUBJID EoT_A PR_A PD_A CR_A EoT_B PR_B PD_B CR_B EoT_C PR_C PD_C CR_C EoT_D
#> 1      1     1    1    1    1     1    1    1    1     1    1    1    1     1
#>   PR_D PD_D CR_D
#> 1    1    1    1

Created on 2023-01-22 with reprex v2.0.2
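
To make the index arithmetic in this answer easier to follow, here is a minimal sketch that splits the one-liner into steps. It works on a plain character vector of names rather than a data frame, and the variable names (cols, suffix, pos, idx, reordered) are illustrative only.

```r
# Column names mirroring the question
cols <- c("SUBJID",
          paste0("EoT_", LETTERS[1:4]), paste0("PR_", LETTERS[1:4]),
          paste0("PD_", LETTERS[1:4]), paste0("CR_", LETTERS[1:4]))

# Step 1: drop the ID column and keep only the text after the last underscore
suffix <- sub(".*_", "", cols[-1])

# Step 2: order() returns positions within cols[-1];
# tied suffixes keep their original relative order
pos <- order(suffix)

# Step 3: shift by 1 to get positions in the full vector, with column 1 first
idx <- c(1, pos + 1)
reordered <- cols[idx]
reordered
```

Because ties are left in their original order, the EoT, PR, PD, CR sequence is preserved within each letter.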

Upvotes: 2

jay.sf

Reputation: 73602

Assuming x holds your colnames, you can order them by their last character, which substring() extracts by starting at position nchar(x).

c(x[1], x[-1][order(substring(x[-1], nchar(x[-1])))])
# [1] "SUBJID" "EoT_A"  "PR_A"   "PD_A"   "CR_A"   "EoT_B" 
# [7] "PR_B"   "PD_B"   "CR_B"   "EoT_C"  "PR_C"   "PD_C"  
# [13] "CR_C"   "EoT_D"  "PR_D"   "PD_D"   "CR_D"
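
The snippet above only reorders the vector of names. A possible way to apply the result to the data frame itself is sketched below; the small df and the names last_char and new_order are hypothetical and chosen just for illustration.

```r
# Hypothetical data frame with two suffixes per prefix
df <- data.frame(SUBJID = 1:2, EoT_A = 1:2, EoT_B = 1:2, PR_A = 1:2, PR_B = 1:2)

x <- colnames(df)

# Last character of every name except the ID column
last_char <- substring(x[-1], nchar(x[-1]))

# ID column first, then the remaining names sorted by their last character
new_order <- c(x[1], x[-1][order(last_char)])

# Index the data frame by name to reorder its columns
df <- df[new_order]
colnames(df)
```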

Upvotes: 1

Julian

Reputation: 9320

You could use dplyr::ends_with, e.g.

df |> 
  dplyr::select(SUBJID, dplyr::ends_with(LETTERS[1:4])) |> 
  colnames()

 [1] "SUBJID" "EoT_A"  "PR_A"   "PD_A"   "CR_A"   "EoT_B"  "PR_B"   "PD_B"  
 [9] "CR_B"   "EoT_C"  "PR_C"   "PD_C"   "CR_C"   "EoT_D"  "PR_D"   "PD_D"  
[17] "CR_D"

Upvotes: 9

Allan Cameron

Reputation: 174478

I don't know how smart it is, but you can do

df[c(1, order(sapply(strsplit(names(df), '_'), function(x) rev(x)[1])[-1]) + 1)]

For example, if your data frame looks like this:

df
#>   SUBJID EoT_A EoT_B EoT_C EoT_D PR_A PR_B PR_C PR_D PD_A PD_B PD_C PD_D CR_A CR_B CR_C CR_D
#> 1      1     2     3     4     5    6    7    8    9   10   11   12   13   14   15   16   17

Then the code puts your data into the required order:

df[c(1, order(sapply(strsplit(names(df), '_'), function(x) rev(x)[1])[-1]) + 1)]
#>   SUBJID EoT_A PR_A PD_A CR_A EoT_B PR_B PD_B CR_B EoT_C PR_C PD_C CR_C EoT_D PR_D PD_D CR_D
#> 1      1     2    6   10   14     3    7   11   15     4    8   12   16     5    9   13   17
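
The same idea can be written out in separate steps. This is a sketch rather than the answer's exact code: it swaps sapply() and rev(x)[1] for vapply() and p[length(p)], uses fixed = TRUE in strsplit(), and runs on a shortened, made-up vector of names.

```r
# Shortened set of names in the style of the question
cols <- c("SUBJID", "EoT_A", "EoT_B", "PR_A", "PR_B")

# Split every name on the underscore; "SUBJID" stays a single piece
parts <- strsplit(cols, "_", fixed = TRUE)

# Take the last piece of each split name
last_piece <- vapply(parts, function(p) p[length(p)], character(1))

# Order everything except the first column, then shift positions by 1
idx <- c(1, order(last_piece[-1]) + 1)
cols[idx]
```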

Upvotes: 2
