Brendan Carlin

Reputation: 21

Passing a Directory through a Function in R, Then Storing a Variable for Each CSV File?

While I thought I was on track to becoming an R guru in no time, my most recent problem sets were a rude awakening. I've searched this community and worked through a variety of tutorials before posting this question. I need to pass a directory of CSV files to a function and build a data frame that shows the number of complete cases in each file. For example, if I asked for files 1:3 in the directory, the result would have one row per file: 1 - X, 2 - Y, 3 - Z. When I run this code:

complete <- function(directory, id = 1:332) {
        files_list <- list.files(directory, full.names=TRUE)
        for(file in id){
                data <- data.frame()
                data <- rbind(data, read.csv(files_list[file], header=TRUE))
                nobs <- sum(complete.cases(data))


        }

        allnobs <- data.frame(id, nobs)
        allnobs


}

I receive a data.frame that lists the number of complete cases for the final CSV file in id on every row. 192 should pair only with ID 8, and every other ID should have its own number of complete cases. My result, with 192 listed for each ID:

> complete("specdata", 1:8)
  id nobs
1  1  192
2  2  192
3  3  192
4  4  192
5  5  192
6  6  192
7  7  192
8  8  192

I also tried creating the initial data.frame outside the for loop:

complete <- function(directory, id = 1:332) {
        files_list <- list.files(directory, full.names=TRUE)
        data <- data.frame()
        for(file in id){
                data <- rbind(data, read.csv(files_list[file], header=TRUE))
                nobs <- sum(complete.cases(data))


        }

        allnobs <- data.frame(id, nobs)
        allnobs


}

--which ends up giving me the total of complete.cases observed in all files:

> complete("specdata", 1:8)
  id nobs
1  1 3139
2  2 3139
3  3 3139
4  4 3139
5  5 3139
6  6 3139
7  7 3139
8  8 3139

Any assistance here would be greatly appreciated.

Upvotes: 0

Views: 1519

Answers (1)

Gunter

Reputation: 21

Here you go (dir is your directory):

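complete<-function(dir,id)
{
  # full.names = TRUE returns paths that already include dir,
  # so no setwd() calls or hard-coded locations are needed
  file_list <- list.files(dir, full.names = TRUE)

  # one slot per requested id, filled in as the loop runs
  nobs<-integer(length(id))
  p<- 1
  for(i in id)
  {
    data <- read.csv(file_list[i], header=TRUE)
    # count the rows of this file that have no missing values
    n<-sum(complete.cases(data))
    nobs[p]<-n
    p<-p+1
  }
  # returns a matrix with columns id and nobs
  cbind(id,nobs)
}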

The output:

> complete("specdata", 1:8)
     id nobs
[1,]  1  117
[2,]  2 1041
[3,]  3  243
[4,]  4  474
[5,]  5  402
[6,]  6  228
[7,]  7  442
[8,]  8  192
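
For reference, the question's first attempt appears to fail because nobs is a single number that gets overwritten on every pass of the loop, and data.frame(id, nobs) then recycles that one value down every row. The second attempt keeps appending to the same data frame, so the last value is the running total. Here is a minimal sketch of the same idea without a manual counter, assuming the files sort in id order inside the directory. The name count_complete and the three demo files are made up for illustration and are not part of the assignment data.

```r
# Hypothetical helper (not from the posts above): one complete-case count per file.
count_complete <- function(directory, id) {
  # Full paths in sorted order, so files_list[i] is the i-th CSV in the directory
  files_list <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  # vapply returns exactly one integer per requested file
  nobs <- vapply(files_list[id], function(path) {
    sum(complete.cases(read.csv(path, header = TRUE)))
  }, integer(1), USE.NAMES = FALSE)
  data.frame(id = id, nobs = nobs)
}

# Made-up demo data: three small CSV files in a temporary directory
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(a = c(1, NA, 3), b = c(1, 2, 3)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(a = c(NA, NA), b = c(1, 2)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)
write.csv(data.frame(a = c(1, 2, 3, 4), b = c(1, 2, NA, 4)),
          file.path(demo_dir, "003.csv"), row.names = FALSE)

result <- count_complete(demo_dir, 1:3)
print(result)
```

Unlike cbind(id, nobs), which gives a matrix, this returns a data.frame, which is what the question asked for. Either way the key point is the same: keep one count per id instead of a single nobs value.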

--Regards DUDU

Upvotes: 0
