Reputation: 11
I am trying to write R code that takes the iris dataset and log-transforms the numeric columns that meet some criterion, say skewness > 0.2. I tried ldply, but it doesn't give me the output I want: the data frame comes back transposed, the variable names are missing, and the entries of the non-numeric column are messed up.
Before posting this question I searched and found the following related topics, but they didn't quite cover what I was looking for:
Selecting only numeric columns from a data frame
extract only numeric columns from data frame data
Below is the code. Appreciate the help!
data(iris)
df <- iris
df <- ldply(names(df), function(x)
{
  if (class(df[[x]])=="numeric")
  {
    tmp <- df[[x]][!is.na(df[[x]])]
    if (abs(skewness(tmp)) > 0.2)
    {
      df[[x]] <- log10( 1 + df[[x]] )
    }
    else df[[x]] <- df[[x]]
  }
  else df[[x]] <- df[[x]]
  #df[[x]] <- data.frame(df[[x]])
  #df2 <- cbind(df2, df[[x]])
  #return(NULL)
}
)
Upvotes: 1
Views: 600
Reputation: 887048
We can use lapply
library(e1071)
lapply(iris, function(x) if(is.numeric(x) && abs(skewness(x, na.rm = TRUE)) > 0.2)
  log10(1+x) else x)
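Note that lapply returns a plain list, not a data frame. If the goal is to keep the original data frame (same column names, same row count, factor column untouched), one option is to assign the result back into `df[]`. The following is only a minimal sketch: `skew` is a hypothetical base-R helper introduced here so the example runs without extra packages, and it uses a simple moment-based formula that may differ slightly from `e1071::skewness`.

```r
# Hypothetical helper (not from e1071): moment-based sample skewness, NAs dropped.
skew <- function(x) {
  x <- x[!is.na(x)]
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

df <- iris

# Assigning into df[] replaces the columns in place, so df stays a data frame
# with its original names. && short-circuits, so skew() is never called on
# the factor column.
df[] <- lapply(df, function(x) {
  if (is.numeric(x) && abs(skew(x)) > 0.2) log10(1 + x) else x
})

str(df)
```

Swapping `skew(x)` for `skewness(x, na.rm = TRUE)` after `library(e1071)` should give the same structure, assuming the threshold is crossed by the same columns.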
We can also loop over only the columns of interest after creating logical indexes:
i1 <- sapply(iris, is.numeric)
i2 <- sapply(iris[i1], function(x) abs(skewness(x, na.rm = TRUE)) > 0.2)
iris[i1][i2] <- lapply(iris[i1][i2], function(x) log10(1+x))
Upvotes: 0
Reputation: 1751
Try with lapply:
#Skewness package
library(e1071)
lapply(iris, function(x) {
  if(is.numeric(x)){
    if(abs(skewness(x, na.rm = TRUE))>0.2){
      log10(1 + x)} else x
  }
  else x
})
Upvotes: 0