Laurent

Reputation: 11

R script to open folders then identify a file, rename it, and read it

I have recently learned to code with R. I can more or less handle the data within files, but I can't get R to manipulate the files themselves. Here is my problem:

I'd like to open successively, in my working directory "Laurent/R", the 3 folders that are within it ("gene_1", "gene_2", "gene_3").

In each folder, I want one specific .csv file (the one containing the word "Cq") to be renamed "gene_x_Cq", and then I want to move these 3 renamed files into a new folder (is that necessary?).

I then want to open these 3 .csv files one after another (with read.csv, I suppose) to manipulate the data within them. I've looked at different functions like list.files, unlist and file.rename, but I'm not sure they are appropriate and I can't figure out how to use them in my case. Can anyone help? (I use a Mac.) Thanks, Laurent

Upvotes: 1

Views: 77

Answers (1)

Nova

Reputation: 5861

Here's a potential solution. If you don't understand something, just shout out and ask!

setwd("Your own file path/Laurent")
library(stringr)

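# list all .csv files in the working directory and its subfolders
# ("$" anchors the match so only names that end in .csv are kept)
csvfiles <- list.files(recursive = TRUE, pattern = "\\.csv$")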
csvfiles

# Pick out files that have "cq" in their path, ignoring uppercase/lowercase
cq.files <- csvfiles[str_detect(csvfiles, fixed("cq", ignore_case = TRUE))]

# Get the gene number for each file - using "2" here because the gene folder is at the
# second level in the file path (e.g. "R/gene_1/..."); character 6 of "gene_1" is the number
gene.nb <- str_sub(word(cq.files, 2, 2, sep = "/"), 6, 6)
gene.nb

# create a new folder to place new files into
dir.create("R/genefiles")

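# This will copy files, not move them. To move them, use file.rename - but be careful, I'd try file.copy first.
# file.copy returns TRUE/FALSE for each file, so store that under a new name instead of overwriting cq.files.
copied <- file.copy(cq.files,
                    paste0("R/genefiles/gene_", gene.nb, "_", "Cq", ".csv"))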

# Now to work with all files in the new folder
library(purrr)
genefiles <- list.files("R/genefiles", full.names = TRUE)

# This will bring all the data into one dataframe (map_dfr needs the dplyr package installed).
# If you want them brought in as separate dataframes,
# use something like gene1 <- read.csv("R/genefiles/gene_1_Cq.csv")
files <- map_dfr(genefiles, read.csv)
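
If you'd rather avoid extra packages and keep the three files as separate data frames, here is a rough base R sketch of the same steps. It builds a small demo folder tree under tempdir() so that it runs as-is. The Laurent_demo folder, the demo file names (run_Cq_results.csv, other.csv) and the variable names are all made up for illustration, so swap in your own paths.

```r
# Build a throwaway folder tree so the sketch runs anywhere (hypothetical demo data)
root <- file.path(tempdir(), "Laurent_demo")
unlink(root, recursive = TRUE)  # start clean if the sketch is re-run
for (g in c("gene_1", "gene_2", "gene_3")) {
  dir.create(file.path(root, g), recursive = TRUE)
  write.csv(data.frame(well = c("A1", "A2"), Cq = c(20.1, 21.3)),
            file.path(root, g, "run_Cq_results.csv"), row.names = FALSE)
  write.csv(data.frame(x = 1), file.path(root, g, "other.csv"), row.names = FALSE)
}

# Find the .csv files whose file name contains "cq" (case-insensitive)
csv_files <- list.files(root, pattern = "\\.csv$", recursive = TRUE, full.names = TRUE)
cq_files <- csv_files[grepl("cq", basename(csv_files), ignore.case = TRUE)]

# The parent folder name ("gene_1", ...) becomes the prefix of the new file name
gene_names <- basename(dirname(cq_files))

# Copy each file into a new folder under its new name
out_dir <- file.path(root, "genefiles")
dir.create(out_dir, showWarnings = FALSE)
new_paths <- file.path(out_dir, paste0(gene_names, "_Cq.csv"))
copied <- file.copy(cq_files, new_paths)  # one TRUE/FALSE per file

# Read each file into its own data frame, kept in a list named after the gene
gene_data <- lapply(new_paths, read.csv)
names(gene_data) <- gene_names
```

After this, gene_data$gene_1 (or gene_data[["gene_1"]]) is the data frame for gene 1, and you can loop over the list with lapply. To move rather than copy, file.rename(cq_files, new_paths) takes the same two arguments, but I'd still check the copy first.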

Upvotes: 2
