Helen

Reputation: 107

Error message when using lapply to apply a function to multiple dataframes in a list.

Each of my datasets looks like this, and I have a list of such data frames.

   Plot_ID Canopy_infection_rate DAI 
1  YO01    5                     7   
2  YO01    8                     14
3  YO01    10                    21

What I want to do is to apply a function called "audpc_Canopyinfactionrate" to a list of dataframes.

However, when I run lapply, I get the following error:

Error in FUN(X[[i]], ...) : argument "DAI" is missing, with no default

I've checked the data frames in my list, and none of the columns are shifted.

Does anyone know what's wrong with it? Thanks

Here is part of my code:

# Read files into a list
lst <- list()
for(i in seq_along(files)) {
  lst[[i]] <- read.delim(files[i], header = TRUE, sep = " ")
}

# Apply a function to the list
densities <- lapply(lst, audpc_Canopyinfactionrate)

# Canopy infection rate
audpc_Canopyinfactionrate <- function(Canopy_infection_rate,DAI){
  n <- length(DAI)
  meanvec <- matrix(-1,(n-1))
  intvec <- matrix(-1,(n-1))
  for(i in 1:(n-1)){
    meanvec[i] <- mean(c(Canopy_infection_rate[i],
                         Canopy_infection_rate[i+1]))
    intvec[i] <- DAI[i+1] - DAI[i]
  }

  infprod <- meanvec * intvec
  sum(infprod)

}

Upvotes: 1

Views: 1285

Answers (1)

KenHBS

Reputation: 7164

As pointed out in the comments, the problem lies in the way you are using lapply.

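The signature of this function is lapply(X, FUN, ...). FUN is the function that gets applied to each element of X, where X is a list or vector (a data.frame is treated as a list of its columns). Each element of X is passed to FUN as its first argument. So far so good.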
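To make the calling convention concrete, here is a minimal sketch. The list x and the anonymous function are made up for illustration and are not part of your code:

```r
# Hypothetical two-element list, used only for illustration
x <- list(a = 1:3, b = 4:6)

# Each element of x is passed as the FIRST argument of FUN
sums <- lapply(x, sum)

# Anything given after FUN is forwarded unchanged to every call of FUN
scaled <- lapply(x, function(v, k) v * k, k = 10)

print(sums)
print(scaled)
```

The result is always a list with one entry per element of X.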

Back to your case: You want to apply a function audpc_Canopyinfactionrate() to all data frames in lst. This function takes two arguments, Canopy_infection_rate and DAI, and I think this is where things got mixed up in your code. The way you are calling lapply, each of lst[[1]], lst[[2]], etc. is passed as the only argument to audpc_Canopyinfactionrate(). The whole data frame ends up in Canopy_infection_rate, nothing is supplied for DAI, and R stops with: argument "DAI" is missing, with no default.

If you rewrite your function so that it takes a whole data frame, you can use lst[[1]], lst[[2]], etc. as its only argument, because you know each data frame contains the columns you need, Canopy_infection_rate and DAI:

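audpc_Canopyinfactionrate <- function(df){
  n <- nrow(df)
  meanvec <- matrix(-1, (n-1))
  intvec  <- matrix(-1, (n-1))
  # seq_len() skips the loop when df has only one row (1:(n-1) would not)
  for(i in seq_len(n-1)){
    # Mean infection rate of two consecutive observations
    meanvec[i] <- mean(c(df$Canopy_infection_rate[i],
                         df$Canopy_infection_rate[i+1]))
    # Number of days between the two observations
    intvec[i] <- df$DAI[i+1] - df$DAI[i]
  }

  infprod <- meanvec * intvec
  return(sum(infprod))
}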

Call lapply in the following way:

lapply(lst, audpc_Canopyinfactionrate)
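As a quick check of the idea, here is a sketch that runs on small example data. The helper name audpc_vectorized and the list lst_example are hypothetical. The first data frame holds the three rows from your question, and the second contains made-up values. The helper is a vectorized variant of the same sum (mean of consecutive rates times the interval between them), not your original loop:

```r
# Vectorized sketch of the same sum (hypothetical helper name)
audpc_vectorized <- function(df) {
  rate <- df$Canopy_infection_rate
  dai  <- df$DAI
  # Mean of each pair of consecutive rates, times the days between them
  sum((head(rate, -1) + tail(rate, -1)) / 2 * diff(dai))
}

# First data frame: rows shown in the question; second: made-up values
lst_example <- list(
  data.frame(Plot_ID = "YO01",
             Canopy_infection_rate = c(5, 8, 10),
             DAI = c(7, 14, 21)),
  data.frame(Plot_ID = "YO02",
             Canopy_infection_rate = c(0, 4, 6),
             DAI = c(7, 14, 21))
)

# One result per data frame, returned as a list
res <- lapply(lst_example, audpc_vectorized)

# vapply() returns a plain numeric vector instead of a list
res_vec <- vapply(lst_example, audpc_vectorized, numeric(1))
print(res_vec)
```

For the rows in your question this gives (5 + 8) / 2 * 7 + (8 + 10) / 2 * 7 = 108.5.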

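Note: lapply can also be used with functions that take more than one argument, because anything passed through the ... in lapply(X, FUN, ...) is forwarded to every call of FUN. Those extra arguments are the same for every element of X, so in your case, where each data frame carries its own DAI column, I think this is not the best option.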
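If you would rather keep a two-argument function, the sketch below shows two possible ways to call it. The names audpc_two_args and lst_example are hypothetical, and the second data frame contains made-up values. The second variant assumes that every data frame shares the same DAI vector, which may not hold for your data:

```r
# Hypothetical two-argument version, mirroring the original signature
audpc_two_args <- function(Canopy_infection_rate, DAI) {
  sum((head(Canopy_infection_rate, -1) +
       tail(Canopy_infection_rate, -1)) / 2 * diff(DAI))
}

# Example data: first data frame from the question, second made up
lst_example <- list(
  data.frame(Canopy_infection_rate = c(5, 8, 10), DAI = c(7, 14, 21)),
  data.frame(Canopy_infection_rate = c(0, 4, 6),  DAI = c(7, 14, 21))
)

# Option 1: an anonymous wrapper pulls both columns out of each data frame
res_wrapper <- lapply(lst_example, function(df) {
  audpc_two_args(df$Canopy_infection_rate, df$DAI)
})

# Option 2: pass DAI through `...`; the SAME vector is used for every element
# (assumption: all plots were observed on identical days)
rates <- lapply(lst_example, function(df) df$Canopy_infection_rate)
res_dots <- lapply(rates, audpc_two_args, DAI = c(7, 14, 21))

print(unlist(res_wrapper))
print(unlist(res_dots))
```

Option 1 works even when the observation days differ between data frames, which is why it is usually the safer of the two.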

Upvotes: 2
