Reputation: 477
I have a directory called "specdata" that contains CSV files (001.csv, 002.csv, ..., 332.csv). I want my function to read all the files in this directory and return a data.frame where the first column identifies the file and the second column is the number of complete cases in that file.
For example:
id nobs
1 108
2 345
...
etc
I wrote the function below, which reads all the files in the "specdata" directory and counts the complete cases in each file. But I do not know how to put each "nobs" value generated in the loop into a new data.frame in this format:
id nobs
1 108
2 345
...
...
332 16
My function:
complete <- function(directory, id = 1:332) {
for(i in 1:332)
{
if(i<10)
{
path<-paste(directory,"/00",id[i],".csv",sep="")
}
if(i>9 & i<100)
{
path<-paste(directory,"/0",id[i],".csv",sep="")
}
if(i>99 & i<333)
{
path<-paste(directory,"/",id[i],".csv",sep="")
}
mydata<-read.csv(path)
#nobs<-nrow(na.omit(mydata))
nobs<-sum(complete.cases(mydata))
}
}
The problem is that "nobs" is generated one value at a time inside the for loop, and I want to put the "nobs" values for all the files into a data.frame. I have tried a lot of ways but am unable to put the entire list of "nobs" values into the data.frame along with the "id" numbers.
Can someone please suggest a way to return the data.frame in the requested format?
Upvotes: 0
Views: 2334
Reputation: 60060
The simplest way to build up a list of all the nobs values goes something like this:
complete <- function(directory, id = 1:332) {
  # Create an empty vector outside the for loop
  nobs_vector <- c()
  # Loop over the ids that were actually requested, not a hard-coded 1:332
  for (i in seq_along(id))
  {
    # Choose the zero-padding from the id itself, not from the loop index
    if (id[i] < 10)
    {
      path <- paste(directory, "/00", id[i], ".csv", sep = "")
    }
    if (id[i] > 9 & id[i] < 100)
    {
      path <- paste(directory, "/0", id[i], ".csv", sep = "")
    }
    if (id[i] > 99)
    {
      path <- paste(directory, "/", id[i], ".csv", sep = "")
    }
    mydata <- read.csv(path)
    # Count the rows with no missing values
    # (equivalent: nobs <- nrow(na.omit(mydata)))
    nobs <- sum(complete.cases(mydata))
    # Add the value to the end of the vector
    nobs_vector <- c(nobs_vector, nobs)
  }
  # Take a look at the final vector you end up with
  print(nobs_vector)
}
It's not necessarily elegant or efficient, but it does get you those values in a form that persists after the for loop is done. If you want to build up a data frame in a similar way, have a look at ?rbind.
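If the goal is the two-column data.frame from the question, one possible alternative to growing it with rbind is to collect the counts first and build the data frame once at the end. The following is only a sketch under a few assumptions: the files are named with three-digit, zero-padded ids as in the question, and sprintf and vapply are swapped in for the if blocks and the for loop.

```r
# Sketch: count complete cases per file and return them as a data.frame.
# Assumes files are named 001.csv, 002.csv, ... inside `directory`.
complete <- function(directory, id = 1:332) {
  # sprintf("%03d.csv", id) zero-pads each id to three digits, e.g. 1 -> "001.csv"
  paths <- file.path(directory, sprintf("%03d.csv", id))

  # Read each file and count its rows without missing values;
  # vapply returns exactly one integer per path
  nobs <- vapply(
    paths,
    function(p) sum(complete.cases(read.csv(p))),
    integer(1),
    USE.NAMES = FALSE
  )

  # Combine the ids and the counts into the requested two-column layout
  data.frame(id = id, nobs = nobs)
}
```

Called as complete("specdata", 1:3), this should return a three-row data frame with columns id and nobs. Because it works from id directly, it also handles subsets such as c(2, 4, 8).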
Upvotes: 0