Nick Brown

Reputation: 65

file.copy: overcoming Windows long path/filename limitation

I have a table in R containing many files that I need copied to a destination folder. The files are spread out over dozens of folders, each several sub-folders down. I have successfully used the following code to find all of the files and their locations:

(fastq_files <- list.files(Illumina_output, "\\.fastq\\.gz$", recursive = TRUE, include.dirs = TRUE) %>% as_tibble)

After appending the full path, I have a tibble that looks something like this:

full_path
Q:/IlluminaOutput/2019/091119 AB NGS/Data/Intensities/BaseCalls/19-15897-HLA-091119-AB-NGS_S14_L001_R1_001.fastq.gz
Q:/IlluminaOutput/2019/091119 AB NGS/Data/Intensities/BaseCalls/19-15236-HLA-091119-AB-NGS_S14_L001_R2_001.fastq.gz
Q:/IlluminaOutput/2018/062818AB NGS/Data/Intensities/BaseCalls/18-06875-HLA-062818-NGS_S11_L001_R1_001.fastq.gz

Using the file.copy function gives an error that the file name is too long, a known issue in Windows (I am using RStudio on Windows 10).

I found that if I set the working directory to the file's location, I am able to copy files. Starting with a table like this:

file                                                 path
19-14889-HLA-091119-AB-NGS_S14_L001_R1_001.fastq.gz  Q:/IlluminaOutput/2019/091119 AB NGS/Data/Intensities/BaseCalls/
19-14889-HLA-091119-AB-NGS_S14_L001_R2_001.fastq.gz  Q:/IlluminaOutput/2019/091119 AB NGS/Data/Intensities/BaseCalls/
18-09772-HLA-062818-NGS_S11_L001_R1_001.fastq.gz     Q:/IlluminaOutput/2018/062818AB NGS/Data/Intensities/BaseCalls/
18-09772-HLA-062818-NGS_S11_L001_R2_001.fastq.gz     Q:/IlluminaOutput/2018/062818AB NGS/Data/Intensities/BaseCalls/

I used the following code to successfully copy the first file:

(dir <- as.character(as.vector(file_and_path[1,2])))
setwd(dir)
(file <- as.character(as.vector(file_and_path[1,1])))
(file.copy(file, Trusight_output) %>% as_tibble)

I got this to work, but I don't know how to apply these steps to every row in my table. I think I probably have to use the lapply function, but I'm not sure how to construct it.

Upvotes: 0

Views: 1479

Answers (1)

Mikael Jagan

Reputation: 11326

This should do the trick, assuming that file_and_path$file and file_and_path$path are both character vectors and that Trusight_output is an absolute path:

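f <- function(file, from, to) {
    # setwd() returns the previous working directory, so keep it for later
    cwd <- setwd(from)
    # restore the original working directory even if file.copy() fails
    on.exit(setwd(cwd))
    # 'file' is now a short relative name; 'to' must be an absolute path
    file.copy(file, to)
}
# one call to f per row; the single Trusight_output value is recycled
Map(f, file = file_and_path$file, from = file_and_path$path, to = Trusight_output)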

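We use Map here rather than lapply because f takes more than one argument: Map steps through file_and_path$file and file_and_path$path in parallel, and the result is a list with one TRUE or FALSE per file. FWIW, operations like this are often better suited for PowerShell.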
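
To see the pattern end to end, here is a minimal, self-contained sketch that can be run without touching the real data. The folder names (run_a, run_b, dest) and file names are hypothetical stand-ins created under tempdir(), not the actual Illumina paths, and the copied and failed variables are only there for illustration.

```r
# Hypothetical stand-ins for the real folders; everything lives under tempdir().
src1 <- file.path(tempdir(), "run_a")
src2 <- file.path(tempdir(), "run_b")
dest <- file.path(tempdir(), "dest")   # absolute path, like Trusight_output
for (d in c(src1, src2, dest)) dir.create(d, showWarnings = FALSE)

# Create one empty placeholder file in each source folder.
file.create(file.path(src1, "a_R1.fastq.gz"))
file.create(file.path(src2, "b_R1.fastq.gz"))

# Same shape as the table in the question: one file name and one folder per row.
file_and_path <- data.frame(
  file = c("a_R1.fastq.gz", "b_R1.fastq.gz"),
  path = c(src1, src2),
  stringsAsFactors = FALSE
)

f <- function(file, from, to) {
  cwd <- setwd(from)      # switch into the source folder, remember the old one
  on.exit(setwd(cwd))     # always switch back when f returns
  file.copy(file, to)
}

# Map returns a list named by file; unlist() turns it into a named logical vector.
copied <- unlist(Map(f, file = file_and_path$file, from = file_and_path$path, to = dest))

# Names of any files that could not be copied (empty if everything worked).
failed <- names(copied)[!copied]
```

Collecting the return values this way makes it easy to spot files that were skipped, since file.copy reports a failure by returning FALSE rather than raising an error.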

Upvotes: 2
