Reputation: 28339
My files are:
CT.BP.50.txt
CT.BP.200.txt
CT.BP.500.txt
GP.BP.50.txt
GP.BP.200.txt
GP.BP.500.txt
files <- c("CT.BP.50.txt", "CT.BP.200.txt", "CT.BP.500.txt", "GP.BP.50.txt", "GP.BP.200.txt", "GP.BP.500.txt")
I want to perform a specific operation on each of them. I can do this:
for (i in seq_along(files)) {
  foo <- read.table(files[i])
  barplot(table(foo$V1), main = files[i])
}
But R plots them in this order:
"CT.BP.200.txt" "CT.BP.500.txt" "CT.BP.50.txt" "GP.BP.200.txt" "GP.BP.500.txt" "GP.BP.50.txt"
And I want them to be plotted in sorted order:
"CT.BP.50.txt" "CT.BP.200.txt" "CT.BP.500.txt" "GP.BP.50.txt" "GP.BP.200.txt" "GP.BP.500.txt"
How do I sort objects with alphanumeric names?
Upvotes: 4
Views: 2703
Reputation: 4795
It looks like you want to sort by particular components of your filename in a particular order.
So I would start by breaking the filename into its components with something like:
filesmat <- matrix(unlist(strsplit(files, split = "\\.")), byrow = TRUE, ncol = 4)
Then extract the columns that you want to sort by:
numbercomponent <- as.numeric(filesmat[, 3])
varname <- filesmat[, 1]
Then reorder the filenames with something like:
files <- files[order(varname, numbercomponent)]
Then just plot any way you want.
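Put together, the steps above can be sketched as one self-contained snippet. This is only an illustration under the assumption that every name has exactly four dot-separated parts with the number in the third position; the input is deliberately shuffled, and the names `parts` and `sorted_files` are made up for this example.

```r
# File names from the question, deliberately shuffled.
files <- c("CT.BP.200.txt", "GP.BP.500.txt", "CT.BP.50.txt",
           "GP.BP.50.txt", "CT.BP.500.txt", "GP.BP.200.txt")

# Split each name on literal dots and stack the pieces into a matrix
# (assumes every name splits into exactly 4 parts).
parts <- do.call(rbind, strsplit(files, ".", fixed = TRUE))

# Order by the prefix (column 1), then by the number (column 3) as a numeric.
sorted_files <- files[order(parts[, 1], as.numeric(parts[, 3]))]
print(sorted_files)
```

If the names had a different number of components, the column indices would need adjusting.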
Upvotes: 1
Reputation: 58835
The problem is that list.files() returns the file names in standard (lexically) sorted order, and the digits are being compared position by position rather than as part of a number.
files <- sort(c("Gen.Var_CT.BP.200.txt", "Gen.Var_CT.BP.500.txt",
"Gen.Var_CT.BP.50.txt", "Gen.Var_GP.BP.200.txt",
"Gen.Var_GP.BP.500.txt", "Gen.Var_GP.BP.50.txt"))
On my system, this gives:
> files
[1] "Gen.Var_CT.BP.200.txt" "Gen.Var_CT.BP.50.txt" "Gen.Var_CT.BP.500.txt"
[4] "Gen.Var_GP.BP.200.txt" "Gen.Var_GP.BP.50.txt" "Gen.Var_GP.BP.500.txt"
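The difference between the two kinds of comparison can be seen on the bare numbers. This is a small sketch, not part of the original example: it uses method = "radix" because that method sorts character vectors in the C locale, so the result does not depend on the system's collation settings. The names `as_text` and `as_numbers` are illustrative.

```r
nums <- c("50", "200", "500")

# Compared as text, character by character: "2" sorts before "5",
# and "50" sorts before "500" because it is a prefix of it.
as_text <- sort(nums, method = "radix")
print(as_text)      # "200" "50" "500"

# Compared as numbers.
as_numbers <- sort(as.numeric(nums))
print(as_numbers)   # 50 200 500
```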
The function gtools::mixedsort will (in general) sort the way you want: a series of digits in a string is treated as a number for sorting purposes. There is a bit of a snag with your example, though, because mixedsort assumes that . characters are part of numbers and so sees .200. as a potential number, which can't actually be sorted as a number. Since your examples don't have actual decimal points within them, you can get around this by ordering on a copy of the names in which the dots are replaced with spaces:
library(gtools)  # provides mixedorder()
files <- files[mixedorder(gsub("\\.", " ", files))]
So files is now sorted as:
> files
[1] "Gen.Var_CT.BP.50.txt" "Gen.Var_CT.BP.200.txt" "Gen.Var_CT.BP.500.txt"
[4] "Gen.Var_GP.BP.50.txt" "Gen.Var_GP.BP.200.txt" "Gen.Var_GP.BP.500.txt"
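One detail worth spelling out is why the workaround indexes files by an ordering instead of sorting the modified strings directly: the gsub() call only builds a sort key, and the original names (with their dots) are what should come back. The sketch below shows the key for illustration; the variable name `keys` is made up here, and the gtools call is guarded because the package may not be installed.

```r
files <- c("Gen.Var_CT.BP.200.txt", "Gen.Var_CT.BP.50.txt", "Gen.Var_CT.BP.500.txt")

# Sort key: every literal dot becomes a space, so no digit run touches a dot.
keys <- gsub("\\.", " ", files)
print(keys[1])   # "Gen Var_CT BP 200 txt"

# Order by the key, but index the original names so the dots are preserved.
if (requireNamespace("gtools", quietly = TRUE)) {
  print(files[gtools::mixedorder(keys)])
}
```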
Upvotes: 11
Reputation: 17517
Might this do it?
files <- c("Gen.Var_CT.BP.50.txt", "Gen.Var_CT.BP.200.txt", "Gen.Var_CT.BP.500.txt", "Gen.Var_GP.BP.50.txt", "Gen.Var_GP.BP.200.txt", "Gen.Var_GP.BP.500.txt")
for (i in seq_along(files)) {
  b <- read.table(files[i])
  barplot(table(b$V1), main = files[i])
}
Upvotes: 2