A. Bohyn

Reputation: 187

Apply a loop to a data frame

I'm trying to apply the function RE.Johnson from the Johnson package to a whole data frame df that contains 157 observations of 16 variables, and I'd like to loop through the whole data frame instead of doing it manually. I've tried the following code, but it doesn't work.

lapply(df[1:16], function(x) RE.Johnson(x))

I know it might seem easy for you guys, but I'm just starting with R. Thanks

EDIT

R gives me the error Error in RE.ADT(xsl[, i]) : object 'p' not found, and the data are not transformed. Here is a summary of the data:

'data.frame':    157 obs. of  16 variables:
$ X         : num  786988 781045 777589 775266 786843 ...
$ Y         : num  486608 488691 490089 489293 488068 ...
$ Z         : num  182 128 191 80 131 ...
$ pH        : num  7.93 7.69 7.49 7.66 7.92 7.08 7.24 7.19 7.44 7.37 ...
$ CE        : num  0.775 3.284 3.745 4.072 0.95 ...
$ Nitrate   : int  21 14 18 83 30 42 47 101 85 15 ...
$ NP        : num  19.6 43.6 31.7 18.6 31.7 ...
$ Cl        : num  1.9 21.3 2.56 21.5 3.2 ...
$ HCO3      : num  6.65 4.85 4.4 7.72 4.1 ...
$ CO3       : num  0 0 0 0 0.0736 ...
$ Ca        : num  4.12 7.52 3.48 7.58 4.8 10 4.4 4.6 4.2 7.4 ...
$ Mg        : num  3.94 8.92 2.34 7.1 2.5 ...
$ K         : num  0.1442 0.0759 0.0709 0.3691 0.07 ...
$ Na        : num  2.41 34.55 2.51 44.01 2.1 ...
$ SO4       : num  1.45 23.6 1.2 26.66 2 ...
$ Residu_sec: num  0.496 2.102 2.397 2.606 0.608 ...

Upvotes: 1

Views: 341

Answers (2)

André Tavares

Reputation: 45

The problem arises when the function tries to perform the Anderson-Darling test on a vector of identical values. If you do this, you will get the error:

library(Johnson)
x <- rep(1, times = 100)  # a vector of 100 identical values
RE.ADT(x)                 # triggers the error

So, to solve this problem, you could add a check for it to the if statements inside the function RE.Johnson:

    if (xsb.valida[1, i] == 0 && any(xsb[, i] != xsb[1, i])) {
        xsb.adtest[1, i] <- (RE.ADT(xsb[, i])$p)
    } else {
        xsb.adtest[1, i] <- 0
    }
    if (xsl.valida[1, i] == 0 && any(xsl[, i] != xsl[1, i])) {
        xsl.adtest[1, i] <- (RE.ADT(xsl[, i])$p)
    } else {
        xsl.adtest[1, i] <- 0
    }
    if (xsu.valida[1, i] == 0 && any(xsu[, i] != xsu[1, i])) {
        xsu.adtest[1, i] <- (RE.ADT(xsu[, i])$p)
    } else {
        xsu.adtest[1, i] <- 0
    }
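
If you would rather not edit the package source, one possible workaround is to keep the loop from the question but catch the error per column, so that a failing column does not stop the others. The following is only a sketch: safe_apply and toy_transform are hypothetical names I made up, and toy_transform is a stand-in so the example runs without the Johnson package installed.

```r
# Apply f to every column of df. A column whose call errors yields NULL
# (plus a warning naming the column) instead of aborting the whole loop.
safe_apply <- function(df, f) {
  cols <- names(df)
  out <- lapply(cols, function(nm) {
    tryCatch(
      f(df[[nm]]),
      error = function(e) {
        warning("column '", nm, "' failed: ", conditionMessage(e), call. = FALSE)
        NULL
      }
    )
  })
  names(out) <- cols
  out
}

# Hypothetical stand-in for RE.Johnson: errors on columns with negative values.
toy_transform <- function(x) {
  if (any(x < 0)) stop("negative values")
  sqrt(x)
}

toy_df <- data.frame(a = c(1, 4, 9), b = c(-1, 2, 3))
res <- suppressWarnings(safe_apply(toy_df, toy_transform))

# Names of the columns that could not be transformed.
failed <- names(res)[vapply(res, is.null, logical(1))]
```

With the real package the call would presumably look like safe_apply(df[1:16], Johnson::RE.Johnson), assuming RE.Johnson accepts a plain numeric vector as in the question. Note that this only skips the failing columns; it does not transform them.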

Upvotes: 0

Oliver Frost

Reputation: 827

Not a complete solution, just some information for others.

I tried Johnson::RE.Johnson manually on the columns of the iris data frame. It seems to work fine for Sepal.Length and Petal.Length only:

lapply(iris[c(1,3)], Johnson::RE.Johnson)

... and it returns the error you mentioned for Sepal.Width and Petal.Width:

lapply(iris[c(2,4)], Johnson::RE.Johnson)

Error in RE.ADT(xsl[, i]) : object 'p' not found

This seems odd because all of those columns have a data type of num. The iris data frame doesn't appear to have any missing values or extra character values hidden anywhere, so I'm not sure why the calculation is working for those columns but not others.
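
For anyone who wants to repeat that sanity check, this is roughly what I mean, using base R only. The variable names are just for illustration.

```r
# Count missing values in each column of iris.
na_counts <- colSums(is.na(iris))

# Record the (first) class of each column to spot non-numeric data.
col_classes <- vapply(iris, function(col) class(col)[1], character(1))

print(na_counts)
print(col_classes)
```

The same two checks can be run on any data frame before looping over its columns.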

Without understanding too much about what Johnson::RE.Johnson is doing to the data, it looks like it is unable to calculate a value for p and so cannot complete the iteration for those columns.

From exploring the source code, the function appears to break down at this point:

  if (xsb.valida[1, i] == 0) 
    xsb.adtest[1, i] <- (Johnson::RE.ADT(xsb[, i])$p) # succeeds
  if (xsl.valida[1, i] == 0) 
    xsl.adtest[1, i] <- (Johnson::RE.ADT(xsl[, i])$p) # fails
  if (xsu.valida[1, i] == 0) 
    xsu.adtest[1, i] <- (Johnson::RE.ADT(xsu[, i])$p) # fails

The function attempts to run Johnson::RE.ADT on xsl, which at this point is a vector of just 0s. RE.ADT then fails with the same error: the p value is not found.
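
To illustrate the idea of guarding against a constant vector before running a test on it, here is a small standalone sketch. has_variation and guarded_p are hypothetical helpers, not part of the Johnson package, and I use base R's shapiro.test as a stand-in because it also refuses constant input. The fallback of 0 mirrors the value used in the other answer.

```r
# TRUE when x has at least two distinct non-missing values.
has_variation <- function(x) {
  x <- x[!is.na(x)]
  length(x) > 1 && any(x != x[1])
}

# Run `test` only on non-degenerate input; otherwise return `fallback`.
# `test` is a placeholder for something like function(v) Johnson::RE.ADT(v)$p.
guarded_p <- function(x, test, fallback = 0) {
  if (has_variation(x)) test(x) else fallback
}

# Stand-in test: the Shapiro-Wilk p-value from base R.
sw_p <- function(v) shapiro.test(v)$p.value

zeros <- rep(0, 100)
p_const <- guarded_p(zeros, sw_p)              # test is skipped, fallback returned
p_real <- guarded_p(iris$Sepal.Length, sw_p)   # test runs normally
```

Calling sw_p(zeros) directly would stop with an error, which is the same kind of failure RE.ADT shows on the all-zero xsl column.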

Upvotes: 1
