Reputation: 296
I have several files in CSV format, and I need to import them and combine them into a single data frame (DF) using a "for" loop. My files are named:
FILE1.CSV; FILE2.CSV; FILE3.CSV
#FILE1
NAME<- c("JOHN","DONALD","CARL")
PRICE <- c(50, 60, 70)
FILE1 <- data.frame(NAME,PRICE)
#FILE2
NAME<- c("MICHAEL","CRIS","MARY")
PRICE <- c(12, 33, 78)
CITY<- c("NY", "LA","LON")
FILE2 <- data.frame(NAME,PRICE,CITY)
#FILE3
NAME<- c("PAUL","BROWN","WAL")
PRICE <- c(99, 54, 22)
CITY<- c("PAR","RIO","LIS")
POP<- c(150,369,871)
FILE3 <- data.frame(NAME,PRICE,CITY,POP)
Before combining them into a DF, I want to process each file. Suppose the import, processing and conversion to a DF follow this sequence:
#PART 1
require(tidyverse)
setwd("D:/")
#PART 2
list_file <- list.files(pattern = "*.csv") %>% lapply(read.csv, sep=";")
I have an error here, because only the first file (FILE1) is transformed into DF and the rest are not transformed. I don't know how to fix it.
# PART 3
for (i in 1:seq_along(list_file)){
DF<- as_tibble(list_file[[i]]) %>% select(NAME,PRICE) # Only the variables “NAME” and “PRICE” will be used.
}
From here I want to import FILE2, process it and append it to the existing DF (rbind), and so on until my last file (FILE3). So how can I: 1) import each file; 2) process it; and 3) append it to a DF?
Upvotes: 1
Views: 74
Reputation: 886938
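In the loop, we need to store each result in a list, because 'DF' is overwritten in each iteration. Also, 1:seq_along(list_file) uses only the first element of seq_along(list_file) (R warns about this), so the loop runs just once; seq_along(list_file) on its own is the correct index.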
lst1 <- vector('list', length(list_file))
for (i in seq_along(list_file)) {
  lst1[[i]] <- as_tibble(list_file[[i]]) %>%
    select(NAME, PRICE)
}
From the list, we can use bind_rows from dplyr to bind the elements together. rbind won't work if the column names do not match, or if one list element has extra columns that another lacks.
bind_rows(lst1)
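As a quick check, here is a minimal, self-contained sketch of the loop above. It assumes the three data frames from the question stand in for the imported CSV files (so list_file is built in memory rather than read from disk); everything else follows the code shown above.

```r
library(dplyr)

# In-memory stand-ins for the three CSV files from the question (assumption:
# in the real workflow these would come from read.csv)
FILE1 <- data.frame(NAME = c("JOHN", "DONALD", "CARL"), PRICE = c(50, 60, 70))
FILE2 <- data.frame(NAME = c("MICHAEL", "CRIS", "MARY"), PRICE = c(12, 33, 78),
                    CITY = c("NY", "LA", "LON"))
FILE3 <- data.frame(NAME = c("PAUL", "BROWN", "WAL"), PRICE = c(99, 54, 22),
                    CITY = c("PAR", "RIO", "LIS"), POP = c(150, 369, 871))
list_file <- list(FILE1, FILE2, FILE3)

# Store one processed tibble per file, then bind them once at the end
lst1 <- vector("list", length(list_file))
for (i in seq_along(list_file)) {
  lst1[[i]] <- as_tibble(list_file[[i]]) %>%
    select(NAME, PRICE)
}
DF <- bind_rows(lst1)

dim(DF)  # 9 rows, 2 columns
```

Binding once after the loop avoids growing DF on every iteration, which tends to be slower for many files.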
Also, since we are using tidyverse functions, the import and the column selection can be done in one step
library(dplyr)
library(tidyr)
library(purrr)
library(readr)
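# The files in the question are read with sep = ";", so read_delim is used
# with delim = ";" here; for comma-separated files, read_csv(.x) would do
out <- map_dfr(list.files(pattern = "\\.csv$"), ~
  read_delim(.x, delim = ";") %>%
    select(NAME, PRICE)
)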
Or if we use fread, there is an option to select the columns of interest
library(data.table)
# fread detects the separator automatically; select keeps only the named columns
rbindlist(lapply(list.files(pattern = "\\.csv$"), fread,
                 select = c("NAME", "PRICE")))
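To try the fread approach without touching real data, here is a small sketch that writes two semicolon-separated files to a temporary directory and reads them back. The directory, file names and sample rows are hypothetical and only for illustration; sep = ";" is passed explicitly here, although fread would normally detect it.

```r
library(data.table)

# Hypothetical scratch directory holding two small semicolon-separated files
tmp <- tempfile("csv_demo")
dir.create(tmp)
fwrite(data.frame(NAME = c("JOHN", "DONALD"), PRICE = c(50, 60)),
       file.path(tmp, "FILE1.csv"), sep = ";")
fwrite(data.frame(NAME = c("MICHAEL", "CRIS"), PRICE = c(12, 33),
                  CITY = c("NY", "LA")),
       file.path(tmp, "FILE2.csv"), sep = ";")

# Read only NAME and PRICE from every file, then stack the results
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)
out <- rbindlist(lapply(files, fread, sep = ";", select = c("NAME", "PRICE")))
```

Because select drops the extra CITY column at read time, every element passed to rbindlist has the same two columns.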
Upvotes: 2