Pranav Pandya

Reputation: 477

Dynamically creating a data.frame from a variable that changes in a for loop

I have a directory called "specdata" that contains csv files (such as 001.csv,002.csv,...,332.csv). Now I want my function to read all the files in this directory and return a data.frame where the first column is the name of the file and the second column is the number of complete cases.

For example:

id nobs
1  108
2  345
...
etc

I wrote this function, which reads all the files in the "specdata" directory and computes the number of complete cases in each file. But I do not know how to put each value of "nobs" produced in the loop into a new data.frame in this format:

id  nobs
1   108
2   345
...
...
332 16

My function:

complete <- function(directory, id = 1:332) {

for(i in 1:332)
  {
    if(i<10)
      {

      path<-paste(directory,"/00",id[i],".csv",sep="")
      }
    if(i>9 & i<100)
      {

      path<-paste(directory,"/0",id[i],".csv",sep="") 
      }
    if(i>99 & i<333)
      {

      path<-paste(directory,"/",id[i],".csv",sep="") 
      }  

    mydata<-read.csv(path)
    #nobs<-nrow(na.omit(mydata))
    nobs<-sum(complete.cases(mydata))

  }


}

The problem is that "nobs" is recomputed on every pass through the for loop, and I want to put the "nobs" values for all the files into a data.frame. I have tried many approaches but have been unable to get all of the "nobs" values into the data.frame along with the "id" numbers.

Can someone please suggest a way to return the data.frame in the requested format?

Upvotes: 0

Views: 2334

Answers (1)

Marius

Reputation: 60060

The simplest way to build up a list of all the nobs values goes something like this:

complete <- function(directory, id = 1:332) {
  # Create an empty vector outside the for loop
  nobs_vector <- c()
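  # Loop over the ids that were actually requested, not a hard-coded 1:332
  for(i in seq_along(id))
  {
    # Zero-pad the file name to three digits, based on the id itself
    if(id[i]<10)
    {
      path<-paste(directory,"/00",id[i],".csv",sep="")
    }
    if(id[i]>9 & id[i]<100)
    {
      path<-paste(directory,"/0",id[i],".csv",sep="")
    }
    if(id[i]>99 & id[i]<333)
    {
      path<-paste(directory,"/",id[i],".csv",sep="")
    }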

    mydata<-read.csv(path)
    #nobs<-nrow(na.omit(mydata))
    nobs<-sum(complete.cases(mydata))
    # Add the value to the end of the vector
    nobs_vector <- c(nobs_vector, nobs)
  }
  # Take a look at the final vector you end up with
  print(nobs_vector)
}

It's not necessarily elegant or efficient, but it does get you those values in a form that persists after the for loop is done. If you want to build up a data.frame in a similar way, have a look at ?rbind.
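For what it's worth, here is one way the vector could be turned into the id/nobs data.frame the question asks for. This is only a sketch, not the original answer's code: it assumes every file is named with a three-digit, zero-padded id (as in 001.csv), it swaps the if-chain for sprintf("%03d.csv", ...), and it preallocates the result instead of growing it. The demo directory and its two small files are made up purely so the example runs on its own.

```r
# Sketch: count complete cases per file and return them as a data.frame.
# Assumes files are named <three-digit id>.csv inside `directory`.
complete <- function(directory, id = 1:332) {
  # Preallocate one slot per requested id instead of growing a vector
  nobs <- integer(length(id))
  for (i in seq_along(id)) {
    # sprintf("%03d.csv", 7) gives "007.csv", so no if-chain is needed
    path <- file.path(directory, sprintf("%03d.csv", id[i]))
    mydata <- read.csv(path)
    nobs[i] <- sum(complete.cases(mydata))
  }
  # Combine the ids and the counts column by column
  data.frame(id = id, nobs = nobs)
}

# Hypothetical demo data: two tiny files in a temporary directory
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(a = c(1, NA, 3), b = c(1, 2, 3)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(a = c(1, 2), b = c(NA, NA)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)

result <- complete(demo_dir, id = 1:2)
print(result)
```

Calling rbind inside the loop to append one row at a time would also work, but it copies the whole data.frame on every iteration, so filling a vector first and calling data.frame() once at the end tends to scale better.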

Upvotes: 0
