Reputation: 4575
I'm trying to import a large number of text files and merge them into a single data table using the script below, so I can parse the text. The files were originally eml files, so the formatting is a mess. I'm not interested in separating the text into fields; it would be perfectly fine if the data table had only one field containing all the text from the files. When I run the script below I keep getting the following error.
Error in rbind(deparse.level, ...) :
numbers of columns of arguments do not match
I've tried setting sep= to various values and running it without sep, but it still gives the same error. I've also tried the same code with read.csv in place of read.table, but again I get the same error. Any tips would be greatly appreciated.
setwd("~/stuff/folder")
file_list <- list.files()
for (file in file_list){
  # if the merged dataset doesn't exist, create it
  if (!exists("dataset")){
    dataset <- read.table(file, header=FALSE, fill=TRUE, comment.char="", strip.white=TRUE)
  }
  # if the merged dataset does exist, append to it
  if (exists("dataset")){
    temp_dataset <- read.table(file, header=FALSE, fill=TRUE, comment.char="", strip.white=TRUE)
    dataset <- rbind(dataset, temp_dataset)
    rm(temp_dataset)
  }
}
Upvotes: 0
Views: 160
Reputation: 1237
I think something lighter could work for you and may avoid this specific error:
them.files <- lapply(1:number.of.files, function(x)
  read.table(paste(paste("lolz", x, sep=""), 'txt', sep='.'), header=FALSE, fill=TRUE, comment.char="", strip.white=TRUE))
Adapt the function to whatever your file names are.
Edit: Actually maybe something like this could be better:
them.files <- lapply(1:length(file_list), function(x)
  read.table(file_list[x], header=FALSE, fill=TRUE, comment.char="", strip.white=TRUE))
Merging step:
everyday.Im.merging <- do.call(rbind,them.files)
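Note that do.call(rbind, ...) can still raise the same column-count error if read.table splits the files into different numbers of columns. Since you only need one text field, a minimal sketch that skips column parsing altogether is to read each file with readLines. The helper name read_as_text, the extra file column, and the demo files are made up for illustration and are not part of your setup:

```r
# Hypothetical helper: read one file into a data frame with one row per line.
# The "file" column is an optional extra so you can tell where each line came from.
read_as_text <- function(path) {
  lines <- readLines(path, warn = FALSE)
  data.frame(
    file = rep(basename(path), length(lines)),
    text = lines,
    stringsAsFactors = FALSE
  )
}

# Demo files so the sketch runs on its own; in practice use your own file_list.
demo_dir <- tempfile("eml_demo")
dir.create(demo_dir)
writeLines(c("From: a@example.com", "Hello there, how are you"),
           file.path(demo_dir, "one.txt"))
writeLines(c("Subject: hi", "", "A much longer line with many more words in it"),
           file.path(demo_dir, "two.txt"))

file_list <- list.files(demo_dir, full.names = TRUE)

# Every piece has the same two columns, so rbind cannot complain about widths.
all_text <- do.call(rbind, lapply(file_list, read_as_text))
```

Because each line is stored as a single string, the messy eml formatting never gets a chance to be split into a varying number of fields.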
I am sure there are beautiful ways to do it with dplyr or data.table, but I am a caveman.
If I may add something, I would also fancy a checking step before running the do.call(rbind, ...) merge:
sapply(them.files,str)
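If the str output is too long to scan by eye, a rougher but shorter check is to compare the column counts directly. This is only a sketch; demo.files is a made-up stand-in for them.files so the snippet runs on its own:

```r
# Made-up stand-in for them.files: two data frames with different widths.
demo.files <- list(
  data.frame(V1 = "a", V2 = "b"),
  data.frame(V1 = "a", V2 = "b", V3 = "c")
)

# Number of columns read.table produced for each file.
n_cols <- sapply(demo.files, ncol)

# TRUE means rbind would fail with "numbers of columns of arguments do not match".
widths_differ <- length(unique(n_cols)) > 1

# Positions of the files whose width differs from the first one.
mismatched <- which(n_cols != n_cols[1])
```

If widths_differ is TRUE, the entries in mismatched tell you which files to look at first.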
Upvotes: 1