Reputation: 1644
I would like to sort a df so that its rows appear in the order given by a vector. I tried the approach shown below, but it returns only one row for each entry of the vector rather than the whole df simply re-ordered.
My df is like:
> head(df)
POSITION MEANDEPTH CHROM
1 0:10000 0 chr1
2 10000:20000 0 chr1
3 20000:30000 0 chr1
4 30000:40000 0 chr1
5 40000:50000 0 chr1
6 50000:60000 0 chr1
> tail(df)
POSITION MEANDEPTH CHROM
308834 57170000:57180000 0 chrY
308835 57180000:57190000 0 chrY
308836 57190000:57200000 0 chrY
308837 57200000:57210000 0 chrY
308838 57210000:57220000 0 chrY
308839 57220000:57230000 0 chrY
> levels(df$CHROM)
[1] "chr1" "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18" "chr19" "chr2" "chr20" "chr21" "chr22" "chr3" "chr4"
[18] "chr5" "chr6" "chr7" "chr8" "chr9" "chrM" "chrX" "chrY"
I would like to re-order the df according to df$CHROM so that the rows are in the following order:
# RE_ORDER CHROMS
chrom_order <- c('chr1','chr2','chr3','chr4','chr5','chr6','chr7','chr8','chr9','chr10','chr11',
'chr12','chr13','chr14','chr15','chr16','chr17','chr18','chr19','chr20','chr21','chr22','chrX','chrM')
I have tried:
df <- df[match(chrom_order, df$CHROM),]
but the rows were reordered as follows:
> head(df)
POSITION MEANDEPTH CHROM
1 0:10000 0 chr1
128716 0:10000 0 chr2
169134 0:10000 0 chr3
188964 0:10000 0 chr4
207986 0:10000 0 chr5
226140 0:10000 0 chr6
I want all the chr1 rows to appear together, then all the chr2 rows, then chr3, etc., following the order in the vector 'chrom_order'.
I also tried:
library(dplyr)
df %>%
  slice(match(CHROM, chrom_order))
But this didn't work either. I thought about subsetting the df once per value of df$CHROM and then re-joining the pieces in the order I want, but that seems long-winded and inefficient. I'm sure there is a quick fix?
Upvotes: 1
Views: 73
Reputation: 146110
Just set the order of the levels:
df$CHROM = factor(df$CHROM, levels = chrom_order)
Then you can order your data frame on this column (the order of the levels is now part of the factor):
df[order(df$CHROM, df$POSITION), ]
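Here is a minimal sketch of the two steps above on a made-up six-row data frame (the toy values and the names first_hits, start and sorted are only for illustration, not from the question). It also shows why the match() attempt returned one row per chromosome, and that a value missing from chrom_order (chrY here) becomes NA and sorts last. Sorting on the numeric start of POSITION is an extra assumption on my part: POSITION is text, so ordering on it directly would not be numeric.

```r
# Toy data standing in for the real df (values are made up for illustration)
df <- data.frame(
  POSITION  = c("0:10000", "10000:20000", "0:10000",
                "0:10000", "10000:20000", "0:10000"),
  MEANDEPTH = 0,
  CHROM     = c("chr1", "chr1", "chr10", "chr2", "chr2", "chrY"),
  stringsAsFactors = TRUE
)
chrom_order <- c(paste0("chr", 1:22), "chrX", "chrM")

# Why the original attempt kept one row per chromosome:
# match(x, table) returns only the FIRST position of each x in table
# (and NA for entries of chrom_order that do not occur in the data).
first_hits <- match(chrom_order, df$CHROM)
first_hits[!is.na(first_hits)]   # 1 4 3: one index each for chr1, chr2, chr10

# Step 1: set the level order. Values not in chrom_order (chrY) become NA.
df$CHROM <- factor(df$CHROM, levels = chrom_order)

# Step 2: order on the factor. Within a chromosome, sort on the numeric
# start of the "start:end" string (assumption: POSITION has that layout).
start  <- as.numeric(sub(":.*", "", as.character(df$POSITION)))
sorted <- df[order(df$CHROM, start), ]
sorted
```

If the chrY rows should be kept, chrY would need to be added to chrom_order before calling factor(); otherwise they can be dropped with something like sorted[!is.na(sorted$CHROM), ].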
Side note: not sure if you manually typed the order you want. If so, you might want to do something like this in the future:
chrom_order = c(paste0("chr", 1:22), "chrX", "chrM")
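As a quick check, the sketch below compares the generated vector with the hand-typed one from the question, and uses setdiff() to list any values in the data that are missing from the ordering (those would turn into NA when the factor levels are set). The names typed, built and data_values are hypothetical; data_values just reproduces the set of levels printed in the question.

```r
# Hand-typed order from the question
typed <- c('chr1','chr2','chr3','chr4','chr5','chr6','chr7','chr8','chr9','chr10','chr11',
           'chr12','chr13','chr14','chr15','chr16','chr17','chr18','chr19','chr20','chr21',
           'chr22','chrX','chrM')

# Generated order from the side note
built <- c(paste0("chr", 1:22), "chrX", "chrM")

# Both give the same character vector
identical(typed, built)

# Same set of values as levels(df$CHROM) in the question
data_values <- c(paste0("chr", 1:22), "chrM", "chrX", "chrY")

# Values present in the data but absent from the ordering;
# these would become NA in factor(df$CHROM, levels = built)
setdiff(data_values, built)
```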
Upvotes: 3