Reputation: 111
I am importing several .csv files of temperature records from a folder and need to create a single table merging them all. The first 15 lines of each file must be skipped and the file names must be kept. How can I keep the file names?
Any advice is appreciated
Thanks
> setwd("")
>
> files <- list.files("folder", pattern = ".csv", recursive = TRUE, full.names = TRUE)
>
> data <- do.call(rbind, lapply(files, read.csv, as.is = TRUE, skip = 15, header = TRUE))
With this code I get the table, but I do not know how to add a new variable with the file names:
> data
time temp
1 2019-11-29 19:39:28 14.4
2 2019-11-29 20:09:28 14.4
3 2019-11-29 20:39:28 14.5
4 2019-11-29 21:09:28 14.5
Upvotes: 0
Views: 120
Reputation: 270045
If we use Map rather than lapply then the row names will contain the file paths followed by a sequence number. Add whatever arguments you need to read.csv below, but note that header=TRUE is the default so it is not needed, and as.is=TRUE may not be needed either. No packages are used.
do.call("rbind", Map(\(x) read.csv(x), files))
For example, if we generate the test files as shown in the Note at the end then
files <- Sys.glob("folder/*.csv")
do.call("rbind", Map(read.csv, files))
## X Time demand
## folder/a.csv.1 1 1 8.3
## folder/a.csv.2 2 2 10.3
## folder/b.csv.1 3 3 19.0
## folder/b.csv.2 4 4 16.0
## folder/c.csv.1 5 5 15.6
## folder/c.csv.2 6 7 19.8
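If the file name is wanted as an ordinary column rather than as part of the row names, one possible follow-up is to strip the trailing sequence number from the row names. This is only a sketch: the temporary folder, the file names a.csv and b.csv, and the column name file are made up for illustration, and it assumes the file names themselves do not end in a dot followed by digits.

```r
# Build two small test files in a temporary folder (hypothetical names).
dir <- file.path(tempdir(), "rowname_demo")
dir.create(dir, showWarnings = FALSE)
write.csv(data.frame(time = 1:2, temp = c(14.4, 14.5)),
          file.path(dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(time = 3:4, temp = c(14.6, 14.7)),
          file.path(dir, "b.csv"), row.names = FALSE)

files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
combined <- do.call("rbind", Map(read.csv, files))

# Row names look like "<path>.1", "<path>.2", ...; drop the trailing
# sequence number, then keep only the file name part of the path.
combined$file <- basename(sub("\\.[0-9]+$", "", rownames(combined)))
rownames(combined) <- NULL  # reset to plain 1, 2, 3, ...
combined
```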
In a comment it was later mentioned that a column containing only the filename without folder or extension is desired. In that case it is easier to do it like this. Note that tools comes with R and so does not have to be installed.
library(tools)
Read <- \(x) cbind(read.csv(x), file = file_path_sans_ext(basename(x)))
do.call("rbind", lapply(files, Read))
## X Time demand file
## 1 1 1 8.3 a
## 2 2 2 10.3 a
## 3 3 3 19.0 b
## 4 4 4 16.0 b
## 5 5 5 15.6 c
## 6 6 7 19.8 c
Note
dir.create("folder")
s <- split(BOD, rep(letters[1:3], each = 2))
junk <- Map(write.csv, s, paste0(file.path("folder", names(s)), ".csv"))
Upvotes: 2
Reputation: 7979
You might want something along these lines:
files = list.files("folder", pattern = "\\.csv$", recursive = TRUE, full.names = TRUE)
# data0 =
lapply(files, \(i) { # or basename(files)?
  read.csv(i, as.is = TRUE, skip = 15L, header = TRUE) |>
    transform(name = i) # or sub(".csv", "", i) instead of i
}) |> do.call(what = "rbind")
Edit:
Based on your comment I suggest
# data0 =
lapply(files, \(i) { # read via the full path so the file is found
  read.csv(i, skip = 15L) |>
    transform(source = sub("\\.csv$", "", basename(i))) # file name without folder or extension
}) |> do.call(what = "rbind")
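To try this without the real logger files, here is a minimal self-contained sketch. The test files, the helper names write_logger_file and read_one, and the column name source are all made up for illustration; it assumes each file has exactly 15 lines of metadata before the header row, as described in the question.

```r
# Hypothetical test data: two files, each with 15 metadata lines
# before the real header row.
dir <- file.path(tempdir(), "skip_demo")
dir.create(dir, showWarnings = FALSE)

write_logger_file <- function(path, temps) {
  writeLines(c(paste("metadata line", 1:15),
               "time,temp",
               paste(seq_along(temps), temps, sep = ",")),
             path)
}
write_logger_file(file.path(dir, "sensor1.csv"), c(14.4, 14.5))
write_logger_file(file.path(dir, "sensor2.csv"), c(15.0, 15.1, 15.2))

files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

# Read one file, skipping the metadata, and tag every row with the
# file name (without folder or extension).
read_one <- function(path) {
  df <- read.csv(path, skip = 15)
  df$source <- tools::file_path_sans_ext(basename(path))
  df
}

data0 <- do.call("rbind", lapply(files, read_one))
data0
```

Reading with the full path and deriving the label from basename(path) keeps the two concerns separate, so the same code works with recursive = TRUE as well.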
Upvotes: 2