Reputation: 1907
I have the following data frame with genome coordinates that I wish to sort by the first and second columns, both in increasing order:
chr4 085976379 111570775
chr1 004336501 141626155
chr10 003051921 081538660
...
My code:
dat[order(dat[,1], dat[,2]), ]
I get:
chr1 004336501 141626155
chr10 003051921 081538660
chr4 085976379 111570775
However, I would like to have:
chr1 004336501 141626155
chr4 085976379 111570775
chr10 003051921 081538660
I could remove "chr", re-sort, then add "chr" back, but I was wondering if there is a more "elegant" one-liner?
Thank you!
Upvotes: 2
Views: 51
Reputation: 193517
You can also try mixedorder from the "gtools" package:
library(gtools)
mixedorder(mydf$V1)
# [1] 2 1 3
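# mixedorder() returns a permutation, not ranks, so map V1 to its position
# in the mixed-sorted unique names before breaking ties on V2
mydf[order(match(mydf$V1, mixedsort(unique(mydf$V1))), mydf$V2), ]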
# V1 V2 V3
# 2 chr1 4336501 141626155
# 1 chr4 85976379 111570775
# 3 chr10 3051921 81538660
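Real coordinate tables usually have several rows per chromosome, which the three-row example does not exercise. The following is a minimal sketch of one way to handle that case, assuming gtools is installed: turn V1 into a factor whose levels are in mixed-sort order, so that order() sorts by chromosome first and by V2 within each chromosome. The sample rows and the names chrom and res are hypothetical, made up for illustration.

```r
library(gtools)  # provides mixedsort() and mixedorder()

# Hypothetical sample data with a repeated chromosome.
mydf <- data.frame(
  V1 = c("chr4", "chr1", "chr10", "chr1"),
  V2 = c(85976379, 4336501, 3051921, 100),
  stringsAsFactors = FALSE
)

# Factor levels in natural order: chr1, chr4, chr10.
chrom <- factor(mydf$V1, levels = mixedsort(unique(mydf$V1)))

# order() sorts a factor by level position, then by V2 within each chromosome.
res <- mydf[order(chrom, mydf$V2), ]
res
```

Keeping the factor around can also be convenient later, since functions that respect level order will then list chromosomes in the same order.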
Upvotes: 0
Reputation: 11597
You could try:
dat[order(as.numeric(gsub("chr", "",dat[,1])), dat[,2]), ]
V1 V2 V3
2 chr1 4336501 141626155
1 chr4 85976379 111570775
3 chr10 3051921 81538660
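If the data also contains non-numeric chromosome names such as chrX (an assumption, since the question only shows numbered chromosomes), as.numeric() yields NA for them and emits a warning. A possible base R sketch that sorts numbered chromosomes first, then the remaining ones by name, with made-up sample rows and hypothetical helper names chrom, chrom_num and sorted:

```r
# Hypothetical sample data: repeated chromosomes plus a non-numeric name.
dat <- data.frame(
  V1 = c("chr4", "chrX", "chr1", "chr10", "chr1"),
  V2 = c(85976379, 1200, 4336501, 3051921, 100),
  V3 = c(111570775, 5000, 141626155, 81538660, 900),
  stringsAsFactors = FALSE
)

# Strip the leading "chr" prefix once.
chrom <- sub("^chr", "", dat$V1)

# Numbered chromosomes convert cleanly; others (X, Y, M, ...) become NA.
chrom_num <- suppressWarnings(as.numeric(chrom))

# order() places NA keys last by default, so non-numeric chromosomes follow
# the numbered ones; the name breaks ties among them, and V2 sorts positions
# within a chromosome.
sorted <- dat[order(chrom_num, chrom, dat$V2), ]
sorted
```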
Upvotes: 1