Seyma Kalay

Reputation: 2861

Summary statistic with ifelse statement in R

I want to define a custom summary statistics function. If a column is a factor, I want to see a prop.table for it rather than the summary statistics.

    set.seed(123)
    df <- data.frame(replicate(6, sample(c(1:10, -99),6, rep = T)))
    df$X7 <- factor(df$X6, levels = c(7,9,10)); str(df)
    
    
    summary <- function(x){
      if (is.numeric(x)){
      funs <- c(mean, median, sd, mad, IQR)
      lapply(funs, function(f) f(x, na.rm = T))
      }
      else 
      df[] <- {lapply(df, prop.table)} #not sure how to save the outcome
    }
    
    summary(df)

Expected Answer

                mean   median  sd  mad  IQR
X1            mean(X1)
X2            mean(X2)     
X3            
X4
X5
X6
X7.Factor7  prop.table(X7.Factor7)
X7.Factor9  prop.table(X7.Factor9)
X7.Factor10 prop.table(X7.Factor10)

Upvotes: 0

Views: 120

Answers (1)

dcarlson

Reputation: 11056

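You need to rethink how you want your output to appear. The descriptive statistics and the proportion table will not be easy to combine, since numeric and factor columns could appear in any order and their results have different lengths (five statistics versus one proportion per factor level). Here is a way to get started thinking about it.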

stats <- function(x) {
    if (is.numeric(x)) {
        c(mean=mean(x), median=median(x), sd=sd(x), mad=mad(x), IQR=IQR(x))
    } else {
        prop.table(table(x))
    }
}
result <- sapply(df, stats)
result
# $X1
#      mean    median        sd       mad       IQR 
# -12.50000   3.00000  42.47705   2.96520   3.00000 
# 
# $X2
#      mean    median        sd       mad       IQR 
# -10.83333   5.50000  43.25467   3.70650   4.00000 
# 
# $X3
#      mean    median        sd       mad       IQR 
# -10.66667   7.00000  43.34820   2.96520   5.50000 
# 
# $X4
#     mean   median       sd      mad      IQR 
# 7.833333 8.500000 2.639444 2.223900 2.500000 
# 
# $X5
#      mean    median        sd       mad       IQR 
# -13.16667   3.50000  42.09711   2.96520   3.25000 
# 
# $X6
#     mean   median       sd      mad      IQR 
# 8.666667 9.000000 1.366260 1.482600 2.250000 
# 
# $X7
# x
#         7         9        10 
# 0.3333333 0.3333333 0.3333333 

You can combine the numeric vectors with

num <- sapply(df, is.numeric)
do.call(rbind, result[num])

But you will still have to deal with the table/tables separately.
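
As a rough sketch of one way to handle the tables separately, each factor's proportions could be stacked into a long data frame with one row per level, while the numeric statistics stay in their own matrix. The small `df` below is made-up illustration data (not the sampled data from the question), and the names `num_stats` and `fac_props` are hypothetical. It uses `lapply` and `vapply` instead of `sapply` so the return types do not depend on the input.

```r
# Made-up illustration data: two numeric columns and one factor
df <- data.frame(
  X1 = c(3, 10, 2, 8, 6, 9),
  X6 = c(7, 9, 10, 9, 7, 10)
)
df$X7 <- factor(df$X6, levels = c(7, 9, 10))

# Same idea as the stats() function above
stats <- function(x) {
  if (is.numeric(x)) {
    c(mean = mean(x), median = median(x), sd = sd(x), mad = mad(x), IQR = IQR(x))
  } else {
    prop.table(table(x))
  }
}

result <- lapply(df, stats)                  # always a list
num <- vapply(df, is.numeric, logical(1))    # TRUE for numeric columns

# One row per numeric column, one column per statistic
num_stats <- do.call(rbind, result[num])

# One row per (factor column, level) pair
fac_props <- do.call(rbind, lapply(names(result)[!num], function(nm) {
  p <- result[[nm]]
  data.frame(variable = nm, level = names(p), prop = as.numeric(p))
}))

num_stats
fac_props
```

This keeps two rectangular objects rather than forcing both kinds of result into one table, so the columns of each keep a single meaning.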

Upvotes: 1
