Ordering columns in data frame

Question

I have a data frame with the following column names:

well, DIV10SD7, DIV11SD7, DIV7SD7, DIV9SD7

However, I want the order to be the following:

well, DIV7SD7, DIV9SD7, DIV10SD7, DIV11SD7

So basically, I want to sort by the number after "DIV" and before "SD7". Additionally, I want to leave out the "well" column when I sort.

When I use the following command:

df[,order(names(df))]

The column order is unchanged, except that the well column moves to the end. I believe this is because order() sorts the names as character strings, comparing them one character at a time. So, in this case, the names whose number begins with 1 (DIV10 and DIV11) are placed before DIV7 and DIV9.

Is there a way to change this behavior?

A5C1D2H2I1M1N2O1R2T1 · Accepted Answer

You can try the mixedorder function from the "gtools" package:

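library(gtools)
# Keep column 1 ("well") in place and sort the remaining names in natural order.
# The "+ 1" shifts the indices back, since they were computed without column 1.
mydf[c(1, mixedorder(names(mydf)[-1]) + 1)]
##   well DIV7SD7 DIV9SD7 DIV10SD7 DIV11SD7
## 1    1       7       9        3        5
## 2    2       8      10        4        6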

Sample data:

mydf <- data.frame(well = 1:2, DIV10SD7 = 3:4, DIV11SD7 = 5:6,
                   DIV7SD7 = 7:8, DIV9SD7 = 9:10)

I'd also suggest converting your dataset to a data.table so that you can make use of the set functions in "data.table" (like setcolorder). This will let you update the column order by reference.
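
Here is a rough sketch of what that could look like, assuming both "data.table" and "gtools" are installed. The names dt and new_order are placeholders; mixedsort is the "gtools" companion to mixedorder that returns the sorted values rather than the indices.

```r
library(data.table)
library(gtools)

# Same sample data, built as a data.table this time
dt <- data.table(well = 1:2, DIV10SD7 = 3:4, DIV11SD7 = 5:6,
                 DIV7SD7 = 7:8, DIV9SD7 = 9:10)

# "well" first, then the remaining names in natural (mixed) order
new_order <- c("well", mixedsort(setdiff(names(dt), "well")))

# Reorders the columns of dt in place; no assignment is needed
setcolorder(dt, new_order)
names(dt)
```

Because setcolorder modifies dt by reference, the table is not copied, which can matter for large datasets.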
