Gabriel A. C. Gama

Reputation: 17

How to delete part of a filename for multiple files using R

I'm using R 3.4.4 in RStudio on Windows.

I have file names that are like this:

1-2.edit-sites.annot.xlsx
2-1.edit-sites.annot.xlsx
...
10-1.edit-sites.annot.xlsx

I'm using the following code to rename the files in natural numeric order (1, 2, 3, ..., 10, 11, etc.); mixedsort() comes from the gtools package:

file.rename(mixedsort(sort(list.files(pattern="*edit-sites.annot.xlsx"))), paste0("Sample ", 1:30))

But I can never remove the edit-sites.annot part. It seems that the . before edit makes R treat everything after it as part of the file extension. When I use the code above, I get Sample 1.edit-sites.annot.xlsx, but I would like Sample 1.xlsx.

Upvotes: 0

Views: 1328

Answers (1)

Gautam

Reputation: 2753

This ought to work. For each file that is named "[num1]-[num2].edit-sites.annot.xlsx" it will rename it to "Sample [num1]-[num2].xlsx":

# dir() returns bare file names, so the chosen folder must also be the working directory
fnames <- dir(path = choose.dir(), pattern = '\\.edit-sites\\.annot\\.xlsx$')

> fnames
[1] "1-2.edit-sites.annot.xlsx"  "10-2.edit-sites.annot.xlsx" "11-2.edit-sites.annot.xlsx" "21-2.edit-sites.annot.xlsx"
[5] "31-2.edit-sites.annot.xlsx" "4-2.edit-sites.annot.xlsx" 

# Literal dots are escaped; an unescaped . matches any character
ptrn <- '^([[:digit:]]{1,3}-[[:digit:]]{1,3})\\..*\\.xlsx$'
ptrn2 <- '^(.*)\\.edit-sites\\.annot\\.xlsx$'

lapply(fnames, function(z){
  # [[1]][2] is the first capture group: the leading "num1-num2" part
  suffix <- regmatches(x = z, m = regexec(pattern = ptrn, text = z))[[1]][2] # ptrn2 also works
  file.rename(from = z, to = paste0("Sample ", suffix, ".xlsx"))
})

After running code:

> dir(pattern = '.*xlsx')
[1] "Sample 1-2.xlsx"  "Sample 10-2.xlsx" "Sample 11-2.xlsx" "Sample 21-2.xlsx" "Sample 31-2.xlsx"
[6] "Sample 4-2.xlsx"

ptrn and ptrn2 are based on the sample file names that you supplied. Because I wasn't sure whether your file names always follow that pattern, ptrn explicitly matches the leading digits separated by a dash, while ptrn2 simply captures everything before .edit-sites.annot.xlsx.
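To see what each pattern captures without touching any files, a quick check on plain strings can help. This is only a sketch in base R: `sample_names` is a made-up vector modelled on the names in the question, and `ptrn_digits` / `ptrn_suffix` are hypothetical variants of the two patterns with the literal dots escaped.

```r
# Hypothetical file names modelled on the ones in the question
sample_names <- c("1-2.edit-sites.annot.xlsx", "10-1.edit-sites.annot.xlsx")

# Pattern 1: capture the leading "digits-digits" part
ptrn_digits <- "^([[:digit:]]{1,3}-[[:digit:]]{1,3})\\..*\\.xlsx$"
# Pattern 2: capture everything before ".edit-sites.annot.xlsx"
ptrn_suffix <- "^(.*)\\.edit-sites\\.annot\\.xlsx$"

# sub() with the backreference \\1 keeps only the captured group
by_digits <- sub(ptrn_digits, "\\1", sample_names)
by_suffix <- sub(ptrn_suffix, "\\1", sample_names)

print(by_digits)
print(by_suffix)
```

For names of this shape both patterns should capture the same text; they would differ only for files whose prefix is not two short numbers joined by a dash.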

When I batch-rename files (as in this example), I usually add some print statements (using cat) to show each original name and what it was changed to:

lapply(fnames, function(z){
  suffix <- regmatches(x = z,
                       m = regexec(pattern = ptrn, text = z))[[1]][2] # ptrn2 also works
  # paste0() adds no separator; plain paste() would insert extra spaces
  new_name <- paste0("Sample ", suffix, ".xlsx")
  cat('Original file name: ', z, "\n", sep = "")
  cat('Renamed to: ', new_name, "\n", sep = "")

  file.rename(from = z, to = new_name)
})
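Since file.rename() accepts vectors, the same rename can also be sketched without lapply(). The example below is an illustration rather than part of the approach above: it works in a throwaway temporary folder with made-up empty files, builds the new names with sub(), and prints the old/new pairs before renaming anything. The names `demo_dir`, `old_names` and `new_names` are hypothetical.

```r
# Throwaway folder with empty demo files, so no real data is touched (demo assumption)
demo_dir <- tempfile("rename_demo")
dir.create(demo_dir)
invisible(file.create(file.path(demo_dir, c("1-2.edit-sites.annot.xlsx",
                                            "2-1.edit-sites.annot.xlsx",
                                            "10-1.edit-sites.annot.xlsx"))))

# The pattern is a regular expression, so literal dots are escaped
old_names <- list.files(demo_dir, pattern = "\\.edit-sites\\.annot\\.xlsx$")

# Drop ".edit-sites.annot" and prepend "Sample "
new_names <- paste0("Sample ",
                    sub("\\.edit-sites\\.annot\\.xlsx$", ".xlsx", old_names))

# Dry run: inspect the mapping before renaming
print(data.frame(from = old_names, to = new_names))

# file.rename() is vectorised; full paths avoid relying on the working directory
ok <- file.rename(file.path(demo_dir, old_names),
                  file.path(demo_dir, new_names))
print(all(ok))
print(list.files(demo_dir))
```

To use this on real files, `demo_dir` would be replaced by the actual folder and the file.create() step dropped; checking the printed mapping first is a cheap safeguard, because a rename is not easily undone.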

Upvotes: 1
