user1804925

Reputation: 159

R: store the for loop output in a vector

I am an R beginner. The following is my code:

complete <- function(directory, id = 1:332) {


# Read through all the csv data file
for (i in id) {
    i <- sprintf("%03d", as.numeric(i))
    data <- read.csv(paste(directory, "/", i, ".csv", sep =""))
    good <- complete.cases(data)   # Eliminating the NA rows
    cases <- sum(good == TRUE)  # add complete value    
} 


data.frame(id = id, nobs = cases )
}

When I print the output, I get:

  id nobs
1  1  402
2  2  402
3  3  402
4  4  402
5  5  402          (incorrect)

If I just print cases, I get:

[1] 117
[1] 1041
[1] 243
[1] 474
[1] 402

So the correct output should be:

  id nobs
1  1  117
2  2 1041
3  3  243
4  4  474
5  5  402

I realize that the data frame only takes the last value of cases.

My question is: how can I store each cases value in a vector, so that the data.frame call returns the correct output?

Thanks.

Upvotes: 0

Views: 5068

Answers (3)

oliver

Reputation: 1

complete <- function(directory, id = 1:332) {
  df_total <- data.frame()
  for (x in id) {
    # Build the path to one file, e.g. "<directory>/001.csv"
    filename <- file.path(directory, sprintf("%03d.csv", x))
    df <- read.csv(filename, header = TRUE)
    # Count rows with no missing values; pass df[, cols] instead of df
    # to check only the columns you are interested in
    nobs <- sum(complete.cases(df))
    # Append one row per file to the result
    df_total <- rbind(df_total, data.frame(id = x, nobs = nobs))
  }
  df_total
}

Upvotes: 0

Sven Hohenstein

Reputation: 81733

This is a more efficient function for the task:

complete <- function(directory, id = 1:332) {
  filenames <- file.path(directory, paste0(sprintf("%03d", id), ".csv"))
  data.frame(id = id, 
             nobs = sapply(filenames, function(x) 
                                        sum(complete.cases(read.csv(x)))))
}
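
One detail worth knowing: sapply() names its result after the character vector it iterates over, so the data frame returned above may carry the file paths as row names. A possible variation, sketched below under the hypothetical name complete_vapply (not part of the answer), uses vapply() with USE.NAMES = FALSE to get plain row numbers and a type-checked result. The zero-padded file names are assumed from the question.

```r
# Sketch only: `complete_vapply` is a hypothetical name used for illustration.
complete_vapply <- function(directory, id = 1:332) {
  # Assumes files are named like 001.csv, 002.csv, ... as in the question
  filenames <- file.path(directory, paste0(sprintf("%03d", id), ".csv"))
  nobs <- vapply(
    filenames,
    function(x) sum(complete.cases(read.csv(x))),  # complete rows in one file
    FUN.VALUE = integer(1),  # each file must yield exactly one integer
    USE.NAMES = FALSE        # keep the file paths out of the row names
  )
  data.frame(id = id, nobs = nobs)
}
```

Calling complete_vapply("specdata", 1:5) would then return one row per id, assuming a directory named specdata that holds the files.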

Upvotes: 1

EDi

Reputation: 13310

This should do the job if id is a numeric vector (untested, since you provided no reproducible example).

Otherwise you should use for(i in seq_along(id)) and id[i] inside the loop.

complete <- function(directory, id = 1:332) {

cases <- NULL
# Read through all the csv data file
for (i in id) {
    i <- sprintf("%03d", as.numeric(i))
    data <- read.csv(paste(directory, "/", i, ".csv", sep =""))
    good <- complete.cases(data)   # Eliminating the NA rows
    cases[i] <- sum(good == TRUE)  # add complete value    
} 


data.frame(id = id, nobs = cases )
}
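
Note that in the loop above i is overwritten with a string such as "001", so cases ends up indexed by name rather than by position. The seq_along() variant mentioned earlier could look roughly like the sketch below. It preallocates the result and fills it by position, so it should also cope with an id that does not start at 1 or is not sorted. The name complete_seq is made up here for illustration, and the zero-padded file names are assumed from the question.

```r
# Sketch only: `complete_seq` is a hypothetical name, not from the answer above.
complete_seq <- function(directory, id = 1:332) {
  # One slot per requested id, filled in by position
  cases <- integer(length(id))
  for (i in seq_along(id)) {
    # Assumes files are named like 001.csv, 002.csv, ... as in the question
    path <- file.path(directory, sprintf("%03d.csv", as.integer(id[i])))
    data <- read.csv(path)
    # Number of rows that contain no NA
    cases[i] <- sum(complete.cases(data))
  }
  data.frame(id = id, nobs = cases)
}
```

Because the loop variable is only used as a position, the original id values stay untouched and can be passed straight to data.frame().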

Upvotes: 1
