Kathia

Reputation: 492

Error in Data Frame: arguments imply differing number of rows

Despite reading the existing answers about this error, I still don't know how to fix it in my particular case.

I have to get the number of complete cases in each of a list of files. Each file corresponds to an id (e.g. file1 corresponds to id1). My goal is to get a data frame with the number of complete cases for each id (that is, for each file: file1 contains the pollutants of id1, file2 contains the pollutants of id2, and so on).

When I run the function: complete("pollu", 1:10) --> everything works perfectly

But when I run the functions:

 complete("pollu", 34)

I get 34 rows with ID 34: the first 33 have NA for nobs, and only the last one has the number of complete cases.

 complete(".", c(2, 4, 8, 10, 12)) 

I get the error:

Error in data.frame(id, nobs) : arguments imply differing number of rows: 5, 12

Any help on understanding the error and fixing it would be appreciated.

complete <- function(directory,id=1:332) {
  nobs <- vector()
  files <- list.files(directory)

  for (i in id) {
    ID <- id
    file <- read.csv(files[i])
    nobs[i] <- sum(complete.cases(file),na.rm = TRUE)

}

df <- data.frame(ID,nobs)
colnames(df) <- c("ID", "nobs")
return (df)

}

Upvotes: 2

Views: 20984

Answers (1)

Jeremy Voisey

Reputation: 1297

The problem lies in the for loop and how you've assigned a value to nobs[i]

complete("pollu", 34)

The loop only runs once with i <- 34. But you assign a result to nobs[i], which is actually nobs[34]. This gives you a vector with the 34th value assigned, leaving the others NA by default.

complete(".", c(2, 4, 8, 10, 12)) 

The loop iterates over your 5 values, the largest being 12. In the last iteration you assign a value to nobs[12], so your nobs vector has length 12, while ID has only length 5.
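The growth of nobs can be reproduced without any files. This is a minimal sketch; the assigned values (100 and i * 10) are placeholders, not real counts, and nobs2 is a name introduced only for the illustration. It also shows why the first call did not raise an error: data.frame() recycles a length-1 column to match the longer one, but it cannot recycle 5 values into 12 rows.

```r
# Assigning past the end of a vector pads the gap with NA.
nobs <- vector()
nobs[34] <- 100                      # placeholder count for illustration
length(nobs)                         # 34
sum(is.na(nobs))                     # 33

# With several ids, the length follows the largest index used.
id <- c(2, 4, 8, 10, 12)
nobs2 <- vector()
for (i in id) nobs2[i] <- i * 10     # placeholder values
length(id)                           # 5
length(nobs2)                        # 12
```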

To fix this, loop over the positions in id rather than over its values:

 for (i in seq_along(id)) {
    ID <- id[i]
    file <- read.csv(files[ID])
    nobs[i] <- sum(complete.cases(file), na.rm = TRUE)
 }

i will take the values 1, 2, 3, ... up to the number of ids you require.

EDIT

As id already contains the labels you require, you can build the data frame directly from it:

df <- data.frame(id, nobs)
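Putting the pieces together, the whole function might look like the sketch below. Two things here are assumptions rather than part of the original code: full.names = TRUE is added so that read.csv() receives a path that includes the directory, and the files are assumed to sort so that the n-th file belongs to id n (as the original indexing already implies).

```r
# Sketch of the corrected function, under the assumptions stated above.
complete <- function(directory, id = 1:332) {
  # Assumption: full paths, sorted so that position n matches id n.
  files <- list.files(directory, full.names = TRUE)
  nobs <- integer(length(id))          # one slot per requested id

  for (i in seq_along(id)) {
    data <- read.csv(files[id[i]])     # id[i] picks the file, i picks the slot
    nobs[i] <- sum(complete.cases(data))
  }

  data.frame(id = id, nobs = nobs)     # both columns have length(id) rows
}
```

Because nobs is filled by position (i) and not by id value, complete(".", c(2, 4, 8, 10, 12)) returns five rows and complete("pollu", 34) returns one.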

Upvotes: 3
