Reputation: 105
I have multiple CSV files that I have already read into R. Now I want to append them all into one data set. I tried a few things but got different errors. Can anyone please help me with this?
TRY 1:
mydata <- rbind(x1,x2,x3,x4,x5,x6,x7,x8)
where x1, x2, ..., x8 are the CSV files I read into R. The error I am getting is:
ERROR 1: In `[<-.factor`(`*tmp*`, ri, value = c(NA, NA, NA, NA, NA, NA, NA, :
invalid factor level, NA generated
TRY 2: Then I tried another way:
mydata1 <- c(x1,x2,x3,x4,x5,x6,x7,x8)
mydata2 <- do.call('rbind',lapply(mydata1,read.table,header=T))
ERROR 2: Error in FUN(X[[i]], ...) : 'file' must be a character string or connection
Can anyone please help me understand the right way to do this?
Upvotes: 1
Views: 20219
Reputation: 31
The error shown in the comments on the accepted answer occurs when read.csv is given the bare file names instead of full paths: the names only resolve if the working directory is the folder holding the CSV files. Pass fullpath to lapply so the code works from any directory:
dataset <- do.call("rbind", lapply(fullpath, FUN = function(file) { read.csv(file) }))
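As a rough illustration of the point about full paths, the sketch below builds a throwaway folder (the `fullpath_demo` directory and its two tiny files are hypothetical, created only for this demo) and reads from it without changing the working directory. It relies on `full.names = TRUE` in `list.files()`, which returns paths rather than bare names.

```r
# Hypothetical folder that is NOT the working directory
csv_dir <- file.path(tempdir(), "fullpath_demo")
dir.create(csv_dir, showWarnings = FALSE)
writeLines(c("A,B", "1,2"), file.path(csv_dir, "abc.csv"))
writeLines(c("A,B", "3,4", "5,6"), file.path(csv_dir, "pqr.csv"))

# full.names = TRUE returns paths that read.csv can open from any directory
fullpath <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file and stack the resulting data frames by row
dataset <- do.call(rbind, lapply(fullpath, read.csv))
nrow(dataset)
## [1] 3
```

Using `full.names = TRUE` is equivalent to building the paths with `file.path()` afterwards; either way `read.csv` receives something it can open regardless of `getwd()`.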
Upvotes: 3
Reputation: 1079
You can use a combination of lapply() and do.call().
## cd to the csv directory
setwd("mycsvs")
## read in the csvs only; stringsAsFactors = FALSE keeps text columns as character
csvList <- lapply(list.files("./", pattern = "\\.csv$"), read.csv, stringsAsFactors = FALSE)
## bind them all with do.call
csv <- do.call(rbind, csvList)
You can also use the fread() function from the data.table package together with rbindlist() instead for a performance boost.
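A minimal sketch of the fread() and rbindlist() variant, assuming the data.table package is installed. The `fread_demo` folder and its files are hypothetical and exist only to make the example self-contained.

```r
library(data.table)

# Hypothetical demo folder with two small CSV files
csv_dir <- file.path(tempdir(), "fread_demo")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(data.frame(A = 1:3, B = 4:6), file.path(csv_dir, "one.csv"), row.names = FALSE)
write.csv(data.frame(A = 7:8, B = 9:10), file.path(csv_dir, "two.csv"), row.names = FALSE)

files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# fread() reads each file into a data.table; rbindlist() stacks the list by row
combined <- rbindlist(lapply(files, fread))
dim(combined)
## [1] 5 2
```

If the files might have their columns in a different order, `rbindlist(..., use.names = TRUE)` matches columns by name rather than by position.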
Upvotes: 1
Reputation: 3843
Sample CSV Files
Note
CSV files to be merged here have
- equal number of columns
- same column names
- same order of columns
- number of rows can be different
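The assumptions in this note can be checked before binding. The sketch below is one possible way to do it in base R, reading only the first row of each file and comparing column names; the `check_demo` folder and the `same_layout` variable are hypothetical names used just for this illustration.

```r
# Hypothetical demo folder standing in for the directory of files to merge
demo_dir <- file.path(tempdir(), "check_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("A,B,C,D", "1,2,3,4"), file.path(demo_dir, "abc.csv"))
writeLines(c("A,B,C,D", "1,2,3,40", "2,3,4,50"), file.path(demo_dir, "pqr.csv"))

paths <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read just one data row per file to get its column names cheaply
headers <- lapply(paths, function(p) names(read.csv(p, nrows = 1)))

# TRUE only when every file has the same column names in the same order
same_layout <- all(vapply(headers, identical, logical(1), headers[[1]]))
same_layout
## [1] TRUE
```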
1st csv file abc.csv
A,B,C,D
1,2,3,4
2,3,4,5
3,4,5,6
1,1,1,1
2,2,2,2
44,44,44,44
4,4,4,4
4,4,4,4
33,33,33,33
11,1,11,1
2nd csv file pqr.csv
A,B,C,D
1,2,3,40
2,3,4,50
3,4,50,60
4,4,4,4
5,5,5,5
6,6,6,6
List FILENAMES of CSV Files
Note
The path below, E:/MergeCSV/, contains only the files to be merged and no other CSV files: abc.csv and pqr.csv.
## List filenames to be merged.
filenames <- list.files(path = "E:/MergeCSV/", pattern = "\\.csv$")
## Print filenames to be merged
print(filenames)
## [1] "abc.csv" "pqr.csv"
FULL PATH to CSV Files
## Full path to csv filenames
fullpath <- file.path("E:/MergeCSV", filenames)
## Print Full Path to the files
print(fullpath)
## [1] "E:/MergeCSV/abc.csv" "E:/MergeCSV/pqr.csv"
MERGE CSV Files
## Merge the listed files, reading each one by its full path
dataset <- do.call("rbind", lapply(fullpath, FUN = function(file) { read.csv(file) }))
## Print the merged dataset; if it is large, use `head()` to get a glimpse of it
dataset
#     A  B  C  D
# 1   1  2  3  4
# 2   2  3  4  5
# 3   3  4  5  6
# 4   1  1  1  1
# 5   2  2  2  2
# 6  44 44 44 44
# 7   4  4  4  4
# 8   4  4  4  4
# 9  33 33 33 33
# 10 11  1 11  1
# 11  1  2  3 40
# 12  2  3  4 50
# 13  3  4 50 60
# 14  4  4  4  4
# 15  5  5  5  5
# 16  6  6  6  6
head(dataset)
#    A  B  C  D
# 1  1  2  3  4
# 2  2  3  4  5
# 3  3  4  5  6
# 4  1  1  1  1
# 5  2  2  2  2
# 6 44 44 44 44
## Print dimension of merged dataset
dim(dataset)
## [1] 16 4
Upvotes: 4
Reputation: 2050
How to import all files from a single folder at once and bind them by row (assuming every file has the same format):
library(tidyverse)
list.files(path = "location_of/data/folder_you_want/",
           pattern = "\\.csv$",
           full.names = TRUE) %>%
  map_df(~read_csv(.))
If there is a file that you want to exclude, filter it out of the file list first:
list.files(path = "location_of/data/folder_you_want/",
           pattern = "\\.csv$",
           full.names = TRUE) %>%
  .[ !grepl("data/folder/name_of_file_to_remove.csv", .) ] %>%
  map_df(~read_csv(.))
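In purrr 1.0.0 and later, map_df() is marked as superseded in favour of map() followed by list_rbind(). The sketch below shows that variant under the assumption that purrr (1.0.0 or newer) and readr (2.0.0 or newer) are installed; the `purrr_demo` folder and its two files are hypothetical and only make the example runnable.

```r
library(purrr)
library(readr)

# Hypothetical demo folder; replace with your own data directory
demo_dir <- file.path(tempdir(), "purrr_demo")
dir.create(demo_dir, showWarnings = FALSE)
write_csv(data.frame(A = 1:2, B = 3:4), file.path(demo_dir, "first.csv"))
write_csv(data.frame(A = 5:7, B = 8:10), file.path(demo_dir, "second.csv"))

files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read each file into a tibble, then stack the list of tibbles by row
tables <- map(files, function(f) read_csv(f, show_col_types = FALSE))
combined <- list_rbind(tables)
dim(combined)
## [1] 5 2
```

The older map_df() call still works; this form just follows the currently recommended purrr functions.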
Upvotes: 6