Reputation: 572
I have one short question: how can I reorder a data frame for selected column names only? I need a general solution because I have to apply it to a varying number of V columns (always more than 100 of them).
Example:
Consider I have the data:
dkk <- structure(list(X = 2L, B = 3L, C = 4L, D = 5L, Z = 6L, V1 = 5L,
V6 = 5L, V4 = 5L, V5 = 5L, V3 = 2L, V2 = 2L), .Names = c("X",
"B", "C", "D", "Z", "V1", "V6", "V4", "V5", "V3", "V2"),
class = "data.frame", row.names = c(NA, -1L))
# X B C D Z V1 V6 V4 V5 V3 V2
2 3 4 5 6 5 5 5 5 2 2
How can I reorder the V columns so that they are in ascending order:
# X B C D Z V1 V2 V3 V4 V5 V6
2 3 4 5 6 5 2 2 5 5 5
Many thanks!!
Upvotes: 3
Views: 182
Reputation: 887088
Here is a faster option with setcolorder from data.table:
library(data.table)
i1 <- grep("V\\d+", names(dkk), value = TRUE)
cbind(dkk[setdiff(names(dkk), i1)], setcolorder(dkk[i1], order(i1))[])
# A B C D Z V1 V2 V3 V4 V5 V6
#1 2 3 4 5 6 5 2 2 5 5 5
This becomes a bit more complicated when the 'V' names are intermingled with other columns. For example, suppose we change the column names to
set.seed(24)
names(dkk) <- sample(names(dkk))
dkk
# D C V6 Z V4 V1 B V2 V3 A V5
#1 2 3 4 5 6 5 5 5 5 2 2
Now, the option is to create a numeric index of those columns with 'V' ('i2'), extract the names ('i3') and assign the order of names and columns separately
i2 <- grep("^V\\d+", names(dkk))
i3 <- names(dkk)[i2]
names(dkk)[i2] <- sort(names(dkk)[i2])
dkk[i2] <- dkk[i2][order(i3)]
to get
dkk
# D C V1 Z V2 V3 B V4 V5 A V6
#1 2 3 5 5 5 5 5 6 2 2 4
There is one glitch in the above solution: it doesn't sort correctly when column names contain numbers greater than 9, i.e. 'V10', 'V11', etc., because the names are sorted as strings. Suppose our third column name is 'V100'
colnames(dkk)[3] <- "V100"
dkk
# D C V100 Z V4 V1 B V2 V3 A V5
#1 2 3 4 5 6 5 5 5 5 2 2
i2 <- grep("^V\\d+", names(dkk))
i3 <- names(dkk)[i2]
We can parse the number part with readr::parse_number to assist in ordering:
i4 <- readr::parse_number(i3)
names(dkk)[i2] <- i3[order(i4)]
dkk[i2] <- dkk[i2][order(i4)]
dkk
# D C V1 Z V2 V3 B V4 V5 A V100
#1 2 3 5 5 5 5 5 6 2 2 4
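The same in-place reordering can be sketched in base R without readr, by stripping the prefix and converting the remainder to a number. This is only a minimal sketch: the helper name `sort_v_in_place` and its `prefix` argument are hypothetical, and it assumes the columns to sort are named exactly the prefix followed by digits.

```r
# Hypothetical helper (not part of any package): sort only the "V<number>"
# columns by their numeric suffix, leaving all other columns in their slots.
sort_v_in_place <- function(df, prefix = "V") {
  pattern <- paste0("^", prefix, "\\d+$")
  idx <- grep(pattern, names(df))          # positions of the V columns
  if (length(idx) < 2L) return(df)         # nothing to reorder
  nm  <- names(df)[idx]
  key <- as.numeric(sub(paste0("^", prefix), "", nm))  # numeric suffix
  ord <- order(key)
  df[idx] <- df[idx][ord]                  # move the values
  names(df)[idx] <- nm[ord]                # move the names to match
  df
}

# Same layout as the 'V100' example above
dkk2 <- data.frame(D = 2L, C = 3L, V100 = 4L, Z = 5L, V4 = 6L, V1 = 5L,
                   B = 5L, V2 = 5L, V3 = 5L, A = 2L, V5 = 2L)
res <- sort_v_in_place(dkk2)
res
```

Ordering on the numeric suffix rather than on the names themselves is what keeps 'V100' after 'V5'.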
Data used in this answer:
dkk <- structure(list(A = 2L, B = 3L, C = 4L, D = 5L, E = 6L, V1 = 5L,
V6 = 5L, V4 = 5L, V5 = 5L, V3 = 2L, V2 = 2L), .Names = c("A",
"B", "C", "D", "Z", "V1", "V6", "V4", "V5", "V3", "V2"),
class = "data.frame", row.names = c(NA, -1L))
Upvotes: 4
Reputation: 16277
You can try this with order on colnames:
dkk[,order(colnames(dkk))]
A B C D E V1 V2 V3 V4 V5 V6
2 3 4 5 6 5 2 2 5 5 5
EDIT: To order only the columns that contain "V". Note: I included a Z column in the data set. Basically, I use c() to combine the column names that do not need to be sorted with the "V" column names, which are sorted.
dkk <- structure(list(A = 2L, B = 3L, C = 4L, D = 5L, E = 6L, V1 = 5L,
V6 = 5L, V4 = 5L, V5 = 5L, V3 = 2L, V2 = 2L), .Names = c("A",
"B", "C", "D", "Z", "V1", "V6", "V4", "V5", "V3", "V2"),
class = "data.frame", row.names = c(NA, -1L))
cols <- c(colnames(dkk)[!grepl("V",names(dkk))],
colnames(dkk)[grepl("V",names(dkk))][order(colnames(dkk)[grepl("V",names(dkk))])])
dkk[,cols]
A B C D Z V1 V2 V3 V4 V5 V6
1 2 3 4 5 6 5 2 2 5 5 5
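For a varying number of V columns, the approach above could be wrapped in a small function. The following is a sketch under assumptions, not a definitive implementation: the name `move_v_last` and its `prefix` argument are hypothetical, it anchors the pattern so that a column such as "Value" is not treated as a V column, and it orders by the numeric suffix so that "V10" lands after "V2".

```r
# Hypothetical helper: keep non-V columns first (in their original order),
# followed by the "V<number>" columns sorted by their numeric suffix.
move_v_last <- function(df, prefix = "V") {
  is_v <- grepl(paste0("^", prefix, "\\d+$"), names(df))
  v    <- names(df)[is_v]
  v    <- v[order(as.numeric(sub(paste0("^", prefix), "", v)))]
  df[, c(names(df)[!is_v], v), drop = FALSE]
}

# Small illustrative data set (made up for this sketch)
demo <- data.frame(A = 1L, V10 = 2L, Value = 3L, V2 = 4L, V1 = 5L)
out <- move_v_last(demo)
names(out)
```

Unlike the in-place variants, this moves all V columns to the end, which matches the layout asked for in the question.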
Upvotes: 2