Amaranta_Remedios

Reputation: 773

Read multiple text files into a dataframe in R

I have 50 text files all with multiple words like this

View(file1.txt)
one
two
three
four
cuatro

View(file2)
uno
five
seis
dos

Each file has a single column of words, and the files have different lengths. I want to create a dataframe in R in which each column holds the content of one file and the column name is the file name.

   file1    file2  ...........etc
1  one      uno
2  two      five
3  three    seis
4  four     dos
5  cuatro   

So far I have loaded all the files into a list like this:

files<- lapply(list.files(pattern = "\\.txt$"),read.csv,header=F) 
> class(files)
[1] "list"
df <- data.frame(matrix(unlist(files), ncol= length(files)))

which is definitely close but wrong: there are no holes (some columns should have more data than others, so the shorter ones should end with empty cells), and it's also not naming the columns automatically.

Upvotes: 1

Views: 130

Answers (2)

The idea is to find the file with the maximum length and use that length to pad the shorter ones with NA, so that all the vectors have the same length and can be combined into one data frame.
You can achieve that with different approaches. Here is one way to do it.

# simplify = FALSE keeps a named list even if all files have the same length
files <- sapply(list.files(pattern = "\\.txt$"), readLines, simplify = FALSE)
max_len <- max(sapply(files, length))

# Pad each shorter vector with NA up to max_len
df <- data.frame(sapply(files, function(x) {
  c(x, rep(NA, max_len - length(x)))
}))

# Use the file names without extension as column names
names(df) <- basename(tools::file_path_sans_ext(names(files)))
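
As a side note, the padding step can also be written with base R's `length<-` replacement function, which extends a vector with NA. This is only a minimal sketch: the `files` list below is a hypothetical stand-in for what `readLines` would return for the two example files, not data read from disk.

```r
# Hypothetical stand-ins for two files already read with readLines()
files <- list(
  file1 = c("one", "two", "three", "four", "cuatro"),
  file2 = c("uno", "five", "seis", "dos")
)

max_len <- max(lengths(files))

# `length<-` extends each vector to max_len, filling the new slots with NA
padded <- lapply(files, `length<-`, max_len)

df <- data.frame(padded)
```

Because the list is named, `data.frame()` picks up `file1` and `file2` as column names without a separate `names(df) <- ...` step.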

Upvotes: 1

zx8754

Reputation: 56189

Try this: get the file names, read the files in, find the maximum number of rows, then extend every file to that number of rows. Finally, convert to a data.frame:

f <- list.files(pattern = "\\.txt$", full.names = TRUE)
names(f) <- tools::file_path_sans_ext(basename(f))

res <- lapply(f, read.table)

maxRow <- max(sapply(res, nrow))

data.frame(lapply(res, function(i) i[seq(maxRow), ]))

#    file1 file2
# 1    one   uno
# 2    two  five
# 3  three  seis
# 4   four   dos
# 5 cuatro  <NA>
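
To try this without the real 50 files, here is a small reproducible sketch that writes the two example files from the question into a temporary directory and then runs the same steps. The directory name and the sample contents are assumptions made for illustration only.

```r
# Write two small sample files into a temporary directory (hypothetical data)
dir <- tempfile("words_")
dir.create(dir)
writeLines(c("one", "two", "three", "four", "cuatro"), file.path(dir, "file1.txt"))
writeLines(c("uno", "five", "seis", "dos"), file.path(dir, "file2.txt"))

f <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)
names(f) <- tools::file_path_sans_ext(basename(f))

res <- lapply(f, read.table)
maxRow <- max(sapply(res, nrow))

# Row indices past the end of a data frame come back as NA,
# which is what pads the shorter files
out <- data.frame(lapply(res, function(i) i[seq(maxRow), ]))
```

Note that `read.table` splits on whitespace and treats `#` as a comment character by default, so this relies on each line holding exactly one word. If lines can contain spaces, `readLines` is probably the safer reader.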

Upvotes: 2
