accibio

Reputation: 559

Sorting specific columns of a dataframe by their names in R

df is a test data frame, and I need to sort its last three columns by name in ascending order (without hardcoding the order).

df <- data.frame(X = c(1, 2, 3, 4, 5),
                 Z = c(1, 2, 3, 4, 5),
                 Y = c(1, 2, 3, 4, 5),
                 A = c(1, 2, 3, 4, 5),
                 C = c(1, 2, 3, 4, 5),
                 B = c(1, 2, 3, 4, 5))

Desired output:

> df
  X Z Y A B C
1 1 1 1 1 1 1
2 2 2 2 2 2 2
3 3 3 3 3 3 3
4 4 4 4 4 4 4
5 5 5 5 5 5 5

I'm aware of the order() function but I can't seem to find the right way to implement it to get the desired output.

Upvotes: 1

Views: 1205

Answers (4)

IceCreamToucan

Reputation: 28705

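# TRUE for the last three columns, FALSE for the rest
to_order <- seq(ncol(df)) > ncol(df) - 3

# rank() gives each name's alphabetical rank; order() would give the sorting
# permutation, which matches the ranks only for some inputs (including this one)
df[order(to_order*rank(names(df)))]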
#>   X Z Y A B C
#> 1 1 1 1 1 1 1
#> 2 2 2 2 2 2 2
#> 3 3 3 3 3 3 3
#> 4 4 4 4 4 4 4
#> 5 5 5 5 5 5 5

Created on 2021-12-24 by the reprex package (v2.0.1)
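
To make the masking trick easier to follow, here is a minimal sketch of the intermediate vectors. The variable names (n, mask, ranks, key, idx) are illustrative and not part of the original answer, and the sketch uses rank() on the names; for the question's column names, order(names(df)) happens to return the same vector.

```r
# Example data with the same column names as in the question
df <- data.frame(X = 1:5, Z = 1:5, Y = 1:5, A = 1:5, C = 1:5, B = 1:5)

n <- 3                                  # number of trailing columns to sort (assumed)
mask  <- seq_along(df) > ncol(df) - n   # FALSE FALSE FALSE TRUE TRUE TRUE
ranks <- rank(names(df))                # alphabetical rank of every column name
key   <- mask * ranks                   # 0 for leading columns, rank for trailing ones
idx   <- order(key)                     # ties at 0 keep their original order

print(key)
print(idx)
names(df[idx])
```

Because order() breaks ties by position, the leading columns (all with key 0) stay where they are, and only the trailing columns move.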

Upvotes: 1

TarJae

Reputation: 79286

Update:

Base R:

cbind(df[1:3],df[4:6][,order(colnames(df[4:6]))])

First answer:

We could use relocate from dplyr: https://dplyr.tidyverse.org/reference/relocate.html

relocate() is designed for moving columns.

Here we relocate by index: we take the last column (index 6, which is B) and move it before column 5 (which is C).

library(dplyr)
df %>% 
  relocate(6, .before = 5)

An alternative:

library(dplyr)
# Keep the first three columns as they are, then append the
# remaining columns sorted by name
df %>% 
  select(1:3, all_of(sort(colnames(df)[4:6])))
  X Z Y A B C
1 1 1 1 1 1 1
2 2 2 2 2 2 2
3 3 3 3 3 3 3
4 4 4 4 4 4 4
5 5 5 5 5 5 5

Upvotes: 4

Len Greski

Reputation: 10875

We want to reorder the columns based on their names, so names(df) is the natural argument to pass to order().

The complicating factor is that order() returns a vector of positions rather than names, so if we want to reorder only a subset of the columns, we need an approach that retains the original order of the first three columns.

We accomplish this by building a vector that holds the first three column names followed by the remaining column names sorted with sort(), which returns the values rather than their positions in the vector, and then passing that vector to the [ form of the extract operator.

df <- data.frame(X = c(1, 2, 3, 4, 5),
                 Z = c(1, 2, 3, 4, 5),
                 Y = c(1, 2, 3, 4, 5),
                 A = c(1, 2, 3, 4, 5),
                 C = c(1, 2, 3, 4, 5),
                 B = c(1, 2, 3, 4, 5))

df[,c(names(df[1:3]),sort(names(df[4:6])))]

...and the output:

> df[,c(names(df[1:3]),sort(names(df[4:6])))]
  X Z Y A B C
1 1 1 1 1 1 1
2 2 2 2 2 2 2
3 3 3 3 3 3 3
4 4 4 4 4 4 4
5 5 5 5 5 5 5
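
As a small illustration of the positions-versus-values distinction described above, the following sketch compares order() and sort() on the last three column names from the question. The variable names (nms, pos, vals) are only for illustration.

```r
# Column names from the question's example data frame
nms <- c("X", "Z", "Y", "A", "C", "B")

# order() returns positions within the subset it was given
pos <- order(nms[4:6])

# sort() returns the sorted values themselves
vals <- sort(nms[4:6])

print(pos)   # 1 3 2
print(vals)  # "A" "B" "C"

# Indexing the subset by those positions yields the same names as sort()
identical(nms[4:6][pos], vals)
```

The positions from order() are relative to the subset, which is why using the names from sort() is the simpler way to index the full data frame.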

Upvotes: 1

Clemsang

Reputation: 5491

In base R, select the leading columns as they are, then sort the names of the last three:

df[, c(names(df)[1:(ncol(df)-3)], sort(names(df)[ncol(df)-2:0]))]
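
If the number of trailing columns varies, the same idea could be wrapped in a small helper. This is only a sketch: sort_last_cols and its parameter n are hypothetical names, not part of the original answer, and the function assumes the column names are unique.

```r
# Hypothetical helper: sort the last n columns of a data frame by name,
# leaving the leading columns in their original order.
# Assumes unique column names.
sort_last_cols <- function(df, n = 3) {
  stopifnot(is.data.frame(df), n >= 0, n <= ncol(df))
  nms  <- names(df)
  keep <- head(nms, ncol(df) - n)   # leading column names, untouched
  last <- sort(tail(nms, n))        # trailing column names, sorted
  df[, c(keep, last), drop = FALSE]
}

df <- data.frame(X = 1:5, Z = 1:5, Y = 1:5, A = 1:5, C = 1:5, B = 1:5)
names(sort_last_cols(df, 3))
```

Using head() and tail() instead of 1:(ncol(df)-3) avoids the surprise of 1:0 counting downwards when n equals the number of columns.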

Upvotes: 2
