Geekuna Matata

Reputation: 1439

How to sort alphabetically in R?

My files are of the format:

ada1
ada2
ada3
....
ada10
ada11
ada12

Unfortunately, when I write them out, ada10, ada11 and ada12 come before ada2. Could you help me sort them alphabetically, as they should be?


Edit

Now, I have thousands of these files. Basically, xyz1-12, abc1-12 etc.

I use the following to get all files:

GG <- grep("*.txt", list.files(), value = TRUE)

So I can't put 'ada' manually.

Upvotes: 3

Views: 3445

Answers (4)

dayne

Reputation: 7784

If you can change the file names you could do something like the following:

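names0 <- paste0("a", 1:20)                              # example names: "a1" ... "a20"
temp <- strsplit(names0, "a")                            # split off the letter prefix
ind <- sapply(temp, "[[", 2)                             # keep the numeric part
names1 <- paste0("a", sprintf("%03d", as.numeric(ind)))  # zero-pad to three digits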

> names1
[1] "a001" "a002" "a003" "a004" "a005" "a006"
[7] "a007" "a008" "a009" "a010" "a011" "a012"
[13] "a013" "a014" "a015" "a016" "a017" "a018"
[19] "a019" "a020"

You may have to tweak the call to sprintf, based on this answer.

Just to clarify, using file.rename, it would be pretty easy to rename all your files.
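
A rough sketch of that renaming step, generalised to any letter prefix. It assumes every name ends in digits with no extension; pad_names() is a hypothetical helper (not part of base R), and the demo works in a temporary directory rather than on real files:

```r
# pad_names() is a hypothetical helper: it zero-pads the trailing number of each name.
pad_names <- function(x, width = 3) {
  prefix <- sub("[0-9]+$", "", x)                  # everything before the trailing digits
  num    <- as.integer(sub("^.*[^0-9]", "", x))    # the trailing digits themselves
  paste0(prefix, sprintf(paste0("%0", width, "d"), num))
}

# Demo in a scratch directory so no real files are touched.
demo_dir <- file.path(tempdir(), "pad_demo")
dir.create(demo_dir, showWarnings = FALSE)
old <- c("ada1", "ada2", "ada10")
file.create(file.path(demo_dir, old))

# Rename each file to its zero-padded name.
file.rename(file.path(demo_dir, old), file.path(demo_dir, pad_names(old)))
```

After renaming, a plain sort() or list.files() should return the files in the expected order, since the numeric parts all have the same width.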

Upvotes: 0

Matthew Lundberg

Reputation: 42649

If there are always three characters, you can sort independently by those characters, followed by a numeric sort of the rest of the string:

GG <- paste0(c('ada', 'xyz'), 1:20) # Synthesis of data similar to what your command would give

Using order with multiple arguments gives the permutation of the vector, then indexing by that permutation returns the data in the desired sort order:

GG[order(substring(GG, 1, 3), as.numeric(substring(GG, 4)))]
 [1] "ada1"  "ada3"  "ada5"  "ada7"  "ada9"  "ada11" "ada13" "ada15" "ada17" "ada19" "xyz2"  "xyz4"  "xyz6"  "xyz8"  "xyz10"
[16] "xyz12" "xyz14" "xyz16" "xyz18" "xyz20"

Upvotes: 3

ilir

Reputation: 3224

If you can't change their names to something better (that is, ada001, ada002, ...), then you could create a double index. I am assuming fnames is a vector with the file names, and that the numbers are always preceded by a fixed number of letters.

alpha <- substr(fnames, 1, 3)
num <- as.integer(substr(fnames, 4, nchar(fnames)))

o <- order(alpha, num)   ## that's your sorting vector

You can modify this procedure to accommodate a varying number of letters using regular expressions to find the split.
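
One possible way to do that regular-expression split, as a sketch only. natural_order() is a hypothetical name, the sample file names are made up, and it assumes each name is letters followed by digits, optionally followed by an extension such as .txt:

```r
# natural_order() is a hypothetical helper, not a base R function.
natural_order <- function(fnames) {
  stem  <- sub("\\.[^.]*$", "", fnames)            # drop an extension such as ".txt", if present
  alpha <- sub("[0-9]+$", "", stem)                # leading part, any number of letters
  num   <- as.integer(sub("^.*[^0-9]", "", stem))  # trailing digits as a number
  order(alpha, num)                                # sort by prefix first, then numerically
}

# Made-up sample data with prefixes of different lengths.
fnames <- c("xyz10.txt", "ada2.txt", "abcd1.txt", "ada10.txt", "xyz2.txt", "ada1.txt")
sorted <- fnames[natural_order(fnames)]
print(sorted)
```

Names without any trailing digits would get NA as their number, so they would need separate handling.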

Upvotes: 1

James King

Reputation: 6365

Another way using package gtools:

library(gtools)  # provides mixedsort()
x <- paste0('a', 1:12)
mixedsort(x)
[1] "a1"  "a2"  "a3"  "a4"  "a5"  "a6"  "a7"  "a8"  "a9"  "a10" "a11" "a12"
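
The same call should also cope with several prefixes at once, as in the edited question. A small sketch, assuming gtools is installed; the vector is made-up sample data with no file extensions:

```r
# Assumes the gtools package is installed; GG below is made-up sample data.
if (requireNamespace("gtools", quietly = TRUE)) {
  GG <- c("xyz10", "ada2", "ada10", "xyz2", "ada1")
  sorted <- gtools::mixedsort(GG)   # letters sorted alphabetically, embedded numbers numerically
  print(sorted)
}
```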

Upvotes: 1
