Vaidy

Reputation: 11

R data frame manipulation/transformation using ldply

I am trying to write some R code which will take the iris dataset and do a log transform of the numeric columns as per some criterion, say if skewness > 0.2. I have tried to use ldply, but it doesn't quite give me the output I want. It is giving me a transposed data frame, the variable names are missing and the non-numeric column entries are messed up.

Before posting this question I searched and found the following related topics, but they didn't quite cover what I was looking for:

Selecting only numeric columns from a data frame

extract only numeric columns from data frame data

Below is the code. Appreciate the help!

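library(plyr)   # ldply()
library(e1071)  # skewness() (assumed to come from e1071, as in the answers)
data(iris)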
df <- iris
df <- ldply(names(df), function(x)
  { 
  if (class(df[[x]])=="numeric") 
    {
    tmp <- df[[x]][!is.na(df[[x]])]
    if (abs(skewness(tmp)) > 0.2) 
      {
       df[[x]] <- log10( 1 + df[[x]]  )
      }
    else df[[x]] <- df[[x]]
  }
  else df[[x]] <- df[[x]]
  #df[[x]] <- data.frame(df[[x]])
  #df2 <- cbind(df2, df[[x]])
  #return(NULL)
   }
  )

Upvotes: 1

Views: 600

Answers (2)

akrun

Reputation: 887048

We can use lapply

library(e1071)
# && short-circuits, so skewness() is never called on the factor column
lapply(iris, function(x) if(is.numeric(x) && abs(skewness(x, na.rm = TRUE)) > 0.2) 
                      log10(1+x) else x)

We can also loop by the columns of interest after creating a logical index

i1 <- sapply(iris, is.numeric)
i2 <- sapply(iris[i1], function(x) abs(skewness(x, na.rm = TRUE)) > 0.2)
iris[i1][i2] <- lapply(iris[i1][i2], function(x) log10(1+x))
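Note that lapply on its own returns a plain list rather than a data frame. A minimal sketch of one way to keep the data frame structure and column names, assuming skewness() comes from e1071 and using df as an illustrative copy of iris:

```r
library(e1071)  # provides skewness()

df <- iris  # work on a copy so the built-in dataset stays untouched

# Assigning into df[] replaces the columns one by one,
# so df stays a data frame with its original column names.
df[] <- lapply(df, function(x) {
  if (is.numeric(x) && abs(skewness(x, na.rm = TRUE)) > 0.2) log10(1 + x) else x
})

str(df)
```

The Species column is passed through unchanged because the is.numeric() check fails for factors.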

Upvotes: 0

thepule

Reputation: 1751

Try with lapply:

# Skewness package
library(e1071)

lapply(iris, function(x) {
    if (is.numeric(x)) {
        if (abs(skewness(x, na.rm = TRUE)) > 0.2) {
            log10(1 + x)
        } else x
    } else x
})
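To see which columns actually meet the criterion before transforming anything, the skewness of each numeric column can be inspected first. This is only a sketch, again assuming e1071's skewness(); num_cols and skew are illustrative names:

```r
library(e1071)  # provides skewness()

# Logical index of the numeric columns
num_cols <- vapply(iris, is.numeric, logical(1))

# Skewness of each numeric column; na.rm = TRUE is passed on to skewness()
skew <- vapply(iris[num_cols], skewness, numeric(1), na.rm = TRUE)

round(skew, 3)

# Names of the columns whose absolute skewness exceeds the threshold
names(skew)[abs(skew) > 0.2]
```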

Upvotes: 0
