Mefhisto1

Reputation: 2228

Appending a row to a dataframe while reading from multiple csv files in R

I'm reading multiple csv files in a loop, performing some calculations on each file's data, and then I want to add the result as a new row to a data frame:

for (i in csvFiles) {
    fileToBeRead<-paste(directory, i, sep="/")

    dataframe<-read.csv(paste(fileToBeRead, "csv", sep="."))
    file <- i
    recordsOK <- sum(complete.cases(dataframe))

    record.data <- data.frame(monitorID, recordsOK)
} 

So, I want to add file and recordsOK as a new row to the data frame. This just overwrites the data frame on every iteration, so I end up with only the data from the last csv file. How can I do this while preserving the data from the previous iterations?

Upvotes: 0

Views: 1694

Answers (1)

MrFlick

Reputation: 206606

Building a data.frame one row at a time is almost always the wrong way to do it. Here's a more R-like solution:

OKcount<-sapply(csvFiles, function(i) {
    fileToBeRead<-paste(directory, i, sep="/")

    dataframe<-read.csv(paste(fileToBeRead, "csv", sep="."))
    sum(complete.cases(dataframe))
})

record.data <- data.frame(monitorID=seq_along(csvFiles), recordsOK=OKcount)

The main idea is that you generally build your data column-wise, not row-wise, and then bundle it together in a data.frame when you're all done. Because R has so many vectorized operations, this is usually pretty easy.
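To make the column-wise idea concrete, here is a minimal self-contained sketch. The temporary directory, the two tiny csv files, and the `file` column name are made up for illustration, and it uses `vapply` instead of `sapply` only so the result type is checked; treat it as one possible way to do it, not the only one.

```r
# Hypothetical setup: write two small csv files into a temporary directory.
directory <- tempfile("csvdemo")
dir.create(directory)
write.csv(data.frame(a = c(1, NA, 3), b = c(4, 5, 6)),
          file.path(directory, "001.csv"), row.names = FALSE)
write.csv(data.frame(a = c(1, 2), b = c(NA, NA)),
          file.path(directory, "002.csv"), row.names = FALSE)

csvFiles <- c("001", "002")  # file names without the .csv extension

# Build one whole column (a vector) in a single pass over the files.
OKcount <- vapply(csvFiles, function(i) {
    dataframe <- read.csv(file.path(directory, paste0(i, ".csv")))
    sum(complete.cases(dataframe))
}, integer(1))

# Bundle the finished columns into a data.frame only at the end.
record.data <- data.frame(file = csvFiles, recordsOK = unname(OKcount))
print(record.data)
```

Nothing is grown inside a loop here: each column is complete before `data.frame()` is called.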

But if you really want to add rows to a data.frame, you can rbind (row bind) additional rows in. So instead of overwriting record.data each time, you would do

record.data <- rbind(record.data, data.frame(monitorID, recordsOK))

But that means you will need to define record.data outside of your loop and initialize it with the correct column names and data types, since rbind can only combine data.frames whose columns match. You can initialize it with

record.data <- data.frame(monitorID=numeric(), recordsOK=numeric())
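Putting the initialization and the rbind call together, a rough sketch of the full loop could look like this. The `inputs` list is a hypothetical stand-in for the data frames you would get from `read.csv`, and using the loop index as `monitorID` is an assumption made only to keep the example runnable.

```r
# Hypothetical stand-ins for the data read from each csv file.
inputs <- list(
    data.frame(a = c(1, NA, 3), b = c(4, 5, 6)),
    data.frame(a = c(1, 2),     b = c(NA, 7))
)

# Start with an empty data.frame that already has the right columns and types.
record.data <- data.frame(monitorID = numeric(), recordsOK = numeric())

for (monitorID in seq_along(inputs)) {
    recordsOK <- sum(complete.cases(inputs[[monitorID]]))
    # Append one row; its column names must match those of record.data.
    record.data <- rbind(record.data, data.frame(monitorID, recordsOK))
}

print(record.data)
```

This works, but every rbind call copies the whole data.frame, which is why the column-wise approach tends to scale better with many files.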

Upvotes: 1
