Reputation: 65
I have a table in R containing many files that I need copied to a destination folder. The files are spread out over dozens of folders, each several sub-folders down. I have successfully used the following code to find all of the files and their locations:
(fastq_files <- list.files(Illumina_output, pattern = "\\.fastq\\.gz$", recursive = TRUE) %>% as_tibble)
After appending the full path, I have a tibble that looks something like this:
full_path |
---|
Q:/IlluminaOutput/2019/091119 AB NGS/Data/Intensities/BaseCalls/19-15897-HLA-091119-AB-NGS_S14_L001_R1_001.fastq.gz |
Q:/IlluminaOutput/2019/091119 AB NGS/Data/Intensities/BaseCalls/19-15236-HLA-091119-AB-NGS_S14_L001_R2_001.fastq.gz |
Q:/IlluminaOutput/2018/062818AB NGS/Data/Intensities/BaseCalls/18-06875-HLA-062818-NGS_S11_L001_R1_001.fastq.gz |
Using the `file.copy` function gives an error that the file name is too long, a known issue in Windows (I am using RStudio on Windows 10).
I found that if I set the working directory to the file location, I am able to copy files. Starting with a table like this:
file | path |
---|---|
19-14889-HLA-091119-AB-NGS_S14_L001_R1_001.fastq.gz | Q:/IlluminaOutput/2019/091119 AB NGS/Data/Intensities/BaseCalls/ |
19-14889-HLA-091119-AB-NGS_S14_L001_R2_001.fastq.gz | Q:/IlluminaOutput/2019/091119 AB NGS/Data/Intensities/BaseCalls/ |
18-09772-HLA-062818-NGS_S11_L001_R1_001.fastq.gz | Q:/IlluminaOutput/2018/062818AB NGS/Data/Intensities/BaseCalls/ |
18-09772-HLA-062818-NGS_S11_L001_R2_001.fastq.gz | Q:/IlluminaOutput/2018/062818AB NGS/Data/Intensities/BaseCalls/ |
I used the following code to successfully copy the first file:
(dir <- as.character(as.vector(file_and_path[1,2])))
setwd(dir)
(file <- as.character(as.vector(file_and_path[1,1])))
(file.copy(file, Trusight_output) %>% as_tibble)
I got this to work, but I don't know how to apply these steps to every row in my table. I think I probably have to use the `lapply` function, but I'm not sure how to construct it.
Upvotes: 0
Views: 1479
Reputation: 11326
This should do the trick, assuming that `file_and_path$file` and `file_and_path$path` are both character vectors and that `Trusight_output` is an absolute path:
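f <- function(file, from, to) {
  # setwd() returns the previous working directory, so we can restore it on exit
  cwd <- setwd(from)
  on.exit(setwd(cwd))
  # With the source folder as the working directory, only the short file name is needed
  file.copy(file, to)
}
# Call f once per row; the single destination path is recycled across all rows
Map(f, file = file_and_path$file, from = file_and_path$path, to = Trusight_output)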
We use `Map` here rather than `lapply` because we are applying a function of more than one argument. FWIW, operations like this are often better suited for PowerShell.
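If you want to try the approach before pointing it at the real data, here is a minimal sketch that runs entirely inside a temporary directory. The folder names, file names, and the helper name `copy_from_dir` are made up for illustration; only `setwd`, `on.exit`, `file.copy`, and `Map` come from the answer above. It also collects the `TRUE`/`FALSE` results so you can see which copies failed.

```r
# Hypothetical stand-ins for the real source and destination folders
base  <- tempfile("copy_demo")
src_a <- file.path(base, "run_a")
src_b <- file.path(base, "run_b")
dest  <- file.path(base, "dest")   # absolute, because tempfile() returns an absolute path
for (d in c(src_a, src_b, dest)) dir.create(d, recursive = TRUE)

# A small table shaped like file_and_path, with made-up file names
file_and_path <- data.frame(
  file = c("sample1_R1.fastq.gz", "sample2_R1.fastq.gz"),
  path = c(src_a, src_b),
  stringsAsFactors = FALSE
)
# Create empty placeholder files so there is something to copy
file.create(file.path(file_and_path$path, file_and_path$file))

copy_from_dir <- function(file, from, to) {
  old_wd <- setwd(from)      # switch to the source folder, remembering the old one
  on.exit(setwd(old_wd))     # restore the working directory even if the copy errors
  file.copy(file, to)        # returns TRUE on success, FALSE otherwise
}

# Map() returns a list with one logical per row; unlist() flattens it to a vector
copied <- unlist(Map(copy_from_dir,
                     file = file_and_path$file,
                     from = file_and_path$path,
                     to   = dest))

# File names whose copy did not succeed (empty if everything worked)
file_and_path$file[!copied]
```

Note that `file.copy` does not overwrite by default, so a file that already exists in the destination is reported as `FALSE` rather than replaced.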
Upvotes: 2