Reputation: 41
I'm new to R and I can't make this work with the information I'm finding.
I have many .txt files in a folder, each of them containing data from one subject. The files have identical columns, but the number of rows for each file varies. In addition, the column headers only start in row 9. What I want to do is
I managed to do 1 (I think) using the easycsv package and the following code:
fread_folder(directory = "C:/Users/path/to/my/files",
extension = "TXT",
sep = "auto",
nrows = -1L,
header = "auto",
na.strings = "NA",
stringsAsFactors = FALSE,
verbose=getOption("datatable.verbose"),
skip = 8L,
drop = NULL,
colClasses = NULL,
integer64=getOption("datatable.integer64"),# default:"integer64"
dec = if (sep!=".") "." else ",",
check.names = FALSE,
encoding = "unknown",
quote = "\"",
strip.white = TRUE,
fill = FALSE,
blank.lines.skip = FALSE,
key = NULL,
Names=NULL,
prefix=NULL,
showProgress = interactive(),
data.table=FALSE
)
That worked, however now my problem is that the data frames have been named after the very long path to my files and obviously after the txt files (without the 7 though). So they are very long and unwieldy and contain characters that they probably shouldn't, such as spaces.
So now I'm having trouble merging the data frames into one, because I don't know how else to refer to the data frames other than the default names that have been given to them, or how to rename them, or how to specify how the data frames should be named when importing them in the first place.
Upvotes: 0
Views: 2737
Reputation: 4358
The following should work. However, without sample data or a clearer description of what you want, it's hard to know for certain whether this is what you are looking to accomplish.
#set working directory
setwd("C:/Users/path/to/my/files")
#read in all .txt files but skip the first 8 rows
Data.in <- lapply(list.files(pattern = "\\.txt$"), read.csv, header = TRUE, skip = 8)
#stack all of the tables by row into one data frame
Data.in <- do.call(rbind, Data.in)
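Since each file holds one subject, it can help to record which file each row came from before stacking. The following is only a sketch: the function name read_subject_files and the column name subject are made up for illustration, and it assumes comma-separated files with the header on line 9 (read.csv would need a different sep for tab-delimited data).

```r
# Hypothetical helper: read every .txt file in `dir_path`, skip the lines
# before the header row, and stack the tables by row.
read_subject_files <- function(dir_path, skip = 8) {
  files <- list.files(dir_path, pattern = "\\.txt$", full.names = TRUE,
                      ignore.case = TRUE)
  # Short names: file name without the folder path or the extension
  names(files) <- tools::file_path_sans_ext(basename(files))
  tables <- lapply(files, read.csv, header = TRUE, skip = skip,
                   stringsAsFactors = FALSE)
  # Tag every row with the short name of the file it came from
  tagged <- Map(function(tbl, id) {
    tbl$subject <- rep(id, nrow(tbl))
    tbl
  }, tables, names(tables))
  # unname() keeps rbind from prefixing row names with the list names
  do.call(rbind, unname(tagged))
}

# Usage, with the folder from the question:
# Data.in <- read_subject_files("C:/Users/path/to/my/files")
```

Because the list elements are named with basename() and file_path_sans_ext(), the long folder path never ends up in any object name.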
Upvotes: 1
Reputation: 570
The code below lists the files in your directory, uses each file name to look up the table of the same name as a variable, and then uses rbindlist to combine the tables into a single table. Hope that helps. It assumes each .csv or .txt file in the directory has already been read into the current environment as a separate data.table.
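library(data.table) # provides rbindlist()
directory <- "C:/Users/path/to/my/files" # folder the files were read from
for (x in list.files(directory)) {
  # Remove the .txt extension from the filename to get the table name
  # (the dot is escaped so it matches a literal ".", and only at the end)
  x <- sub("\\.txt$", "", x, ignore.case = TRUE)
  thisTable <- get(x) # use "get" to look up the object named by the string
  # now just combine into a single table
  if (exists("combined")) {
    combined <- rbindlist(list(combined, thisTable))
  } else {
    combined <- thisTable
  }
}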
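The same idea can be written without a loop. This is a sketch rather than a drop-in replacement: combine_loaded_tables and the column name source are illustrative names, and it assumes the tables already exist in the given environment under the names you pass in. mget() accepts any string as a name, so names containing spaces or slashes are fine.

```r
library(data.table)

# Hypothetical helper: combine tables that already exist in `envir` under
# the names in `table_names`, recording each row's origin in `source`.
combine_loaded_tables <- function(table_names, envir = parent.frame()) {
  # Keep only names that actually refer to an object in `envir` itself
  found <- table_names[vapply(table_names, exists, logical(1),
                              envir = envir, inherits = FALSE)]
  tables <- mget(found, envir = envir, inherits = FALSE)
  # idcol adds a column holding the list names, one value per source table
  rbindlist(tables, idcol = "source")
}

# Usage, assuming the tables are named after the files minus ".txt":
# table_names <- sub("\\.txt$", "", list.files(directory), ignore.case = TRUE)
# combined <- combine_loaded_tables(table_names)
```

Passing the whole list to rbindlist once also avoids growing combined one table at a time, and a leftover combined object from an earlier run cannot leak into the result.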
Upvotes: 1