Reputation: 123
list.files() can help find files in a directory, but how can I loop through a list of files that is already stored in a text file? Here all_my_files.txt lists the path to each file, one per row:
file.txt
file2.txt
file3.txt
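library(data.table)
# read the file paths, one per line, into a character vector
files <- readLines("all_my_files.txt")
x <- numeric(length(files))
for (i in seq_along(files)) {
  df <- fread(files[i])
  # store the mean of column V1 for each file
  x[i] <- mean(df$V1)
}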
Upvotes: 0
Views: 2326
Reputation: 1297
You can use lapply to loop through your file names. I use iris like @bs93, but split into 3 separate data.frames.
iris1=iris[1:50,]
iris2=iris[51:100,]
iris3=iris[101:150,]
# write them to text files
write.table(iris2,file="iris2.txt",row.names=FALSE)
write.table(iris3,file="iris3.txt",row.names=FALSE)
write.table(iris1,file="iris1.txt",row.names=FALSE)
# create the text file containing the filenames
filenames <- paste0("iris", 1:3, ".txt")
writeLines(filenames,"filenames.txt")
# Now solve the problem
# read the filenames into a character vector
fn <- readLines("filenames.txt")
# apply `read.table` over that vector of filenames
Ilist <- lapply(fn,read.table,header=TRUE)
# Ilist is a list containing 3 data.frames
str(Ilist)
# Get the mean Sepal.Length from each data.frame in Ilist
x <- sapply(Ilist,function(z) mean(z$Sepal.Length))
x
# if you want to use `data.table`
library(data.table)
# then you can use `fread` instead of `read.table`
Ilist <- lapply(fn,fread)
# Then Ilist will be a list of 3 data.tables
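As a variation on the above, here is a minimal base R sketch that returns one named mean per file. It writes its example files to a temporary directory so it can be run anywhere. The helper name mean_of_column is made up for illustration, and the sketch assumes every listed file has a header row and a numeric Sepal.Length column.

```r
# Write three example files and a list of their paths to a temporary directory
tmp <- tempdir()
paths <- file.path(tmp, paste0("iris", 1:3, ".txt"))
parts <- split(iris, rep(1:3, each = 50))
for (i in seq_along(paths)) {
  write.table(parts[[i]], file = paths[i], row.names = FALSE)
}
list_file <- file.path(tmp, "filenames.txt")
writeLines(paths, list_file)

# Read the paths back in, dropping any blank lines
fn <- readLines(list_file)
fn <- fn[nzchar(trimws(fn))]

# Hypothetical helper: read one file and return the mean of one column
mean_of_column <- function(path, column) {
  df <- read.table(path, header = TRUE)
  mean(df[[column]])
}

# vapply enforces that each file yields exactly one number
means <- vapply(fn, mean_of_column, numeric(1), column = "Sepal.Length")
names(means) <- basename(fn)
means
```

Using vapply instead of sapply is a design choice here: it stops with an error if a file yields something other than a single number, rather than silently returning a list.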
Upvotes: 1
Reputation: 1316
Here is a small reproducible example. We use the built-in iris data set and save it 3 times to our working directory as 'iris1.csv', 'iris2.csv', and 'iris3.csv'. We also save the relative paths to these files (again just 'iris1.csv', 'iris2.csv', and 'iris3.csv') to a text file called 'all_my_files.txt'. We can then read the file paths back in from 'all_my_files.txt' and read the data they point to.
data.table + loop solution:
library(data.table)
library(tidyverse)
#make filenames
filenames <- paste0("iris", 1:3, ".csv")
#save iris dataset three times, naming them 'iris1.csv', 'iris2.csv' etc
#(file name passed by position; the `path` argument is deprecated in newer readr)
walk(filenames, ~write_csv(iris, .x))
#save the filepath
writeLines(filenames, "all_my_files.txt")
#read all the filepaths back in from text file
get_filenames_from_file <- readLines("all_my_files.txt")
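#pre-allocate one slot per file
files <- vector("list", length(get_filenames_from_file))
mean_v1 <- numeric(length(get_filenames_from_file))
for (i in seq_along(get_filenames_from_file)) {
  dat <- fread(get_filenames_from_file[[i]])
  files[[i]] <- dat
  #get mean of a column
  mean_v1[i] <- mean(dat$Sepal.Length)
}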
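A note on loop indices, as a small base R sketch: if the list file happens to be empty, 1:length(x) counts down from 1 to 0 rather than skipping the loop, whereas seq_along(x) yields no iterations. The variable names below are illustrative only.

```r
# An empty character vector, as readLines() returns for an empty file
no_files <- character(0)

# 1:length() gives c(1, 0), so a for loop would run twice on nothing
bad_index <- 1:length(no_files)

# seq_along() gives integer(0), so a for loop would not run at all
good_index <- seq_along(no_files)

# Pre-allocate the result vector, then fill it inside the loop
values <- c(2, 4, 6)
doubled <- numeric(length(values))
for (i in seq_along(values)) {
  doubled[i] <- values[i] * 2
}
doubled
```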
Full tidyverse solution:
library(tidyverse)
#make filenames
filenames <- paste0("iris", 1:3, ".csv")
#save iris dataset three times, naming them 'iris1.csv', 'iris2.csv' etc
#(file name passed by position; the `path` argument is deprecated in newer readr)
walk(filenames, ~write_csv(iris, .x))
#save the filepath
writeLines(filenames, "all_my_files.txt")
#read all the filepaths back in from text file
get_filenames_from_file <- readLines("all_my_files.txt")
#read the data in from the filepaths
data <- map(get_filenames_from_file, read_csv)
In either case we now have a list of 3 iris data frames (called files in the loop version and data in the tidyverse version):
str(data)
List of 3
$ : tibble [150 × 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
..$ Species : chr [1:150] "setosa" "setosa" "setosa" "setosa" ...
..- attr(*, "spec")=
.. .. cols(
.. .. Sepal.Length = col_double(),
.. .. Sepal.Width = col_double(),
.. .. Petal.Length = col_double(),
.. .. Petal.Width = col_double(),
.. .. Species = col_character()
.. .. )
$ : tibble [150 × 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
..$ Species : chr [1:150] "setosa" "setosa" "setosa" "setosa" ...
..- attr(*, "spec")=
.. .. cols(
.. .. Sepal.Length = col_double(),
.. .. Sepal.Width = col_double(),
.. .. Petal.Length = col_double(),
.. .. Petal.Width = col_double(),
.. .. Species = col_character()
.. .. )
$ : tibble [150 × 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
..$ Species : chr [1:150] "setosa" "setosa" "setosa" "setosa" ...
..- attr(*, "spec")=
.. .. cols(
.. .. Sepal.Length = col_double(),
.. .. Sepal.Width = col_double(),
.. .. Petal.Length = col_double(),
.. .. Petal.Width = col_double(),
.. .. Species = col_character()
.. .. )
Upvotes: 0