Marta Román

Reputation: 111

How to keep file names when merging a list of files with lapply?

I am importing several .csv files of temperature records from a folder and need to merge them all into a single table. The first 15 lines of each file must be skipped and the file names must be kept. How can I keep the file names?

Any advice is appreciated

Thanks

> setwd("")
> 
> files <- list.files("folder", pattern = ".csv", recursive = TRUE, full.names = TRUE)
> 
> data <- do.call(rbind, lapply(files, read.csv, as.is = TRUE, skip = 15, header = TRUE))

With this code I get the table, but I do not know how to add a new variable with the file names.

> data
                   time temp
1   2019-11-29 19:39:28 14.4
2   2019-11-29 20:09:28 14.4
3   2019-11-29 20:39:28 14.5
4   2019-11-29 21:09:28 14.5

Upvotes: 0

Views: 120

Answers (2)

G. Grothendieck

Reputation: 270045

If we use Map rather than lapply then the row names will contain the file paths followed by a sequence number. Add whatever args you need to read.csv below but note that header=TRUE is the default so it is not needed and as.is=TRUE may not be needed either. No packages are used.

do.call("rbind", Map(\(x) read.csv(x), files))

For example, if we generate the test files as shown in the Note at the end then

files <- Sys.glob("folder/*.csv")
do.call("rbind", Map(read.csv, files))
##                X Time demand
## folder/a.csv.1 1    1    8.3
## folder/a.csv.2 2    2   10.3
## folder/b.csv.1 3    3   19.0
## folder/b.csv.2 4    4   16.0
## folder/c.csv.1 5    5   15.6
## folder/c.csv.2 6    7   19.8
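
If the file path is wanted as an ordinary column rather than as row names, one possible follow-up is to strip the trailing sequence number from the row names. This is only a sketch: the temporary folder, the two demo files, and the column name `file` are assumptions made for illustration and are not part of the answer above.

```r
# Self-contained demo: write two small CSV files to a temporary folder
# (the folder name and file contents are made up for illustration)
dir <- file.path(tempdir(), "folder_demo")
dir.create(dir, showWarnings = FALSE)
write.csv(data.frame(time = 1:2, temp = c(14.4, 14.5)),
          file.path(dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(time = 3:4, temp = c(15.0, 15.1)),
          file.path(dir, "b.csv"), row.names = FALSE)

files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

# Map() names each list element after its file path, so rbind() builds
# row names of the form "<path>.<row number>"
data <- do.call("rbind", Map(read.csv, files))

# Remove the trailing ".<row number>" to recover the path, then reset row names
data$file <- sub("\\.[0-9]+$", "", rownames(data))
rownames(data) <- NULL
```

This relies on each file contributing more than one row, as in the example above; a file with a single row gets its path as the row name without a numeric suffix, which the `sub()` call leaves unchanged.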

Added

In a comment it was later mentioned that a column containing only the filename without folder or extension is desired. In that case it is easier to do it like this. Note that tools comes with R and so does not have to be installed.

library(tools)

Read <- \(x) cbind(read.csv(x), file = file_path_sans_ext(basename(x)))
do.call("rbind", lapply(files, Read))
##   X Time demand file
## 1 1    1    8.3    a
## 2 2    2   10.3    a
## 3 3    3   19.0    b
## 4 4    4   16.0    b
## 5 5    5   15.6    c
## 6 6    7   19.8    c
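
For files like those in the question, which have 15 lines to skip before the header, the same helper could pass `skip = 15` to `read.csv`. The following is a minimal sketch under stated assumptions: the temporary folder, the file name `logger1.csv`, and the preamble text are hypothetical stand-ins for the real temperature files.

```r
library(tools)  # ships with R; provides file_path_sans_ext()

# Hypothetical demo file: 15 preamble lines, then a header row and two records
dir <- file.path(tempdir(), "skip_demo")
dir.create(dir, showWarnings = FALSE)
path <- file.path(dir, "logger1.csv")
writeLines(c(paste("preamble line", 1:15),
             "time,temp",
             "2019-11-29 19:39:28,14.4",
             "2019-11-29 20:09:28,14.4"), path)

files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)

# Skip the 15 preamble lines, then add the bare file name as a column
Read <- \(x) cbind(read.csv(x, skip = 15), file = file_path_sans_ext(basename(x)))
data <- do.call("rbind", lapply(files, Read))
```

Here `data` ends up with the columns `time`, `temp` and `file`, with `file` holding the file name without folder or extension for every row.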

Note

dir.create("folder")
s <- split(BOD, rep(letters[1:3], each = 2))
junk <- Map(write.csv, s, paste0(file.path("folder", names(s)), ".csv"))

Upvotes: 2

Friede

Reputation: 7979

You might want something along these lines:

files = list.files("folder", pattern = "\\.csv$", recursive = TRUE, full.names = TRUE)
# data0 = 
lapply(files, \(i) { # i is the full path to one file
  read.csv(i, as.is = TRUE, skip = 15L, header = TRUE) |>
    transform(name = i) # or sub("\\.csv$", "", basename(i)) for the bare file name
  }) |> do.call(what = "rbind") 

Edit:

Based on your comment I suggest

# data0 = 
lapply(files, \(i) { # iterate over the full paths so read.csv() can find each file
  read.csv(i, skip = 15L) |>
    transform(source = sub("\\.csv$", "", basename(i))) # file name without folder or extension
  }) |> do.call(what = "rbind") 

Upvotes: 2
