AKumar

Reputation: 13

How to bind tables in a for loop

I am trying to create a table that summarizes data from a dataset. I have:

set.seed(123) 
age <- runif(100, 1, 100)
gender <- sample(c("Male", "Female"), 100, replace=TRUE)
bmi <- rep(c("Normal"), 100)
height <- runif(100, 150, 190)
smoker <- sample(c("Yes", "No"), 100, replace=TRUE)

finaldata <- data.frame(age, gender, bmi, height, smoker)
str(finaldata)
continuous <- finaldata[ ,c(1, 4)]
categorical <- finaldata[ ,c(2, 3, 5)]


Table1 <- function(CONT, CAT, DIGITS=2){
  table_cont <- matrix(0, ncol=2, nrow=ncol(CONT))
  for (i in 1:ncol(CONT)){
    table_cont[i, ] <- c(round(mean(CONT[ ,i]), DIGITS), round(sd(CONT[ ,i]), DIGITS))
  }

  cats <- function(VARIABLE){
    table_cat <- matrix(0, ncol=2, nrow=dim(table(CAT[ ,VARIABLE])))
    for (i in 1:dim(table(CAT[ ,VARIABLE]))){
      table_cat[i, ] <- c(table(CAT[ ,VARIABLE])[i], paste(round(prop.table(table(CAT[ ,VARIABLE]))[i]*100, DIGITS), "%"))
    }
    rownames(table_cat) <- levels(CAT[, VARIABLE])
    table_cat <- rbind(rep("", ncol=ncol(table_cat)), table_cat)
    return(table_cat)
  }
  table_cat <- rbind(cats(1), cats(2), cats(3))

  descriptives <- rbind(table_cont, table_cat)
  return(descriptives)
}
Table1(continuous, categorical)

It works fine. However, to bind the categorical variables I am calling rbind(cats(1), cats(2), cats(3)) explicitly. That is OK for this dataset, but I don't want to edit that line for every other dataset I use. I tried binding them in a for loop but was unsuccessful. How can I bind them without spelling out rbind(cats(1), cats(2), cats(3)) each time?

Upvotes: 1

Views: 1440

Answers (3)

bramtayl

Reputation: 4024

You want to do this instead:

library(dplyr)
library(tidyr)

better_summary = function(data){
  continuous = data %>% Filter(is.numeric, .)
  categorical = data %>% Filter(. %>% is.numeric %>% `!`, .)

  continuous_summary = 
    continuous %>%
    gather(variable, value) %>%
    group_by(variable) %>%
    summarize(mean = mean(value),
              sd = sd(value))

  categorical_summary = 
    categorical %>%
    gather(variable, value) %>%
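    count(variable, value) %>%
    # group explicitly so the percentage is computed within each variable
    group_by(variable) %>%
    mutate(percent = n / sum(n)) %>%
    ungroup()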

  list(continuous_summary = continuous_summary,
       categorical_summary = categorical_summary)
}
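
The column split at the top of better_summary relies on Filter working column-wise on a data frame. Here is a minimal base-R sketch of just that step. The data frame `df` is made up for illustration (it is not the question's data), and `Negate` stands in for the magrittr lambda:

```r
# Made-up data, for illustration only
df <- data.frame(
  age    = c(30, 40, 50),
  height = c(160, 170, 180),
  gender = c("Male", "Female", "Female")
)

# Filter() applies the predicate to each column and keeps the matching ones
continuous  <- Filter(is.numeric, df)
categorical <- Filter(Negate(is.numeric), df)

# Per-column means of the numeric part
means <- sapply(continuous, mean)
```

Because the split is driven by column type, nothing here needs to change when the dataset has a different number of columns.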

Upvotes: 0

road_to_quantdom

Reputation: 1361

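Try this:

table_cat <- data.frame()
# N here is the number of cats() function calls you plan on making
for (i in 1:N) {
  table_cat <- rbind(table_cat, cats(i))
}

If you do not want the rownames issue that the data frame version runs into, start from an empty matrix instead:

table_cat <- matrix(nrow = 0, ncol = ncol(cats(1)))
# one cats() call per categorical column, instead of a hard-coded 1:3
for (i in seq_len(ncol(CAT))) {
  table_cat <- rbind(table_cat, cats(i))
}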
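
The same binding can also be written without growing the result inside a loop, by collecting the pieces in a list and passing them to rbind in one call. The sketch below is only an illustration: the toy `categorical` data frame and the stand-alone `cats(CAT, VARIABLE)` helper are assumptions that mimic the question's nested function, not the original code.

```r
# Toy stand-in for the question's `categorical` data frame (assumption)
categorical <- data.frame(
  gender = c("Male", "Female", "Female", "Male"),
  smoker = c("No", "No", "Yes", "No")
)

# Hypothetical stand-alone version of cats(): a blank row, then count and percent per level
cats <- function(CAT, VARIABLE, DIGITS = 2) {
  tab <- table(CAT[, VARIABLE])
  out <- cbind(as.vector(tab),
               paste0(round(as.vector(prop.table(tab)) * 100, DIGITS), "%"))
  rownames(out) <- names(tab)
  rbind(c("", ""), out)
}

# One matrix per categorical column, however many columns there are
pieces <- lapply(seq_len(ncol(categorical)), function(i) cats(categorical, i))

# Bind all pieces in a single rbind call
table_cat <- do.call(rbind, pieces)
```

Since seq_len(ncol(categorical)) adapts to the data, the rbind line never has to be edited for a new dataset.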

Upvotes: 2

Max Candocia

Reputation: 4385

Unless your rows are dependent on each other, you should use functions like apply or plyr's ddply to process the data without all of the for loops.

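cont.func <- function(CONT.col, DIGITS=2){
  c(round(mean(CONT.col), DIGITS), round(sd(CONT.col), DIGITS))
}
CONT = t(apply(continuous, 2, cont.func))

cat.func <- function(CAT.col, DIGITS=2){
  tab = table(CAT.col)
  # count and percent per level, followed by a blank separator row
  rbind(cbind(tab, paste0(round(prop.table(tab)*100, DIGITS), "%")), "")
}
# lapply always returns a list; apply would simplify to an array
# if every categorical column had the same number of levels
CAT = do.call("rbind", lapply(categorical, cat.func))
rbind(CONT, c("", ""), CAT)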

Also, you can use as.data.frame around the rbind call in cat.func to preserve the categorical variable name when creating CAT. This may be preferable to using blank quotes depending on your needs.
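
As a rough illustration of the data frame idea, the sketch below builds one small data frame per categorical column and carries the variable name in its own column. This is one possible reading of the suggestion, not the answer's exact code. The helper name `cat.func.df`, its arguments, and the toy data are all assumptions.

```r
# Toy stand-in for the question's `categorical` data frame (assumption)
categorical <- data.frame(
  gender = c("Male", "Female", "Female", "Male"),
  smoker = c("No", "No", "Yes", "No")
)

# Hypothetical helper: one row per level, labelled with the variable it came from
cat.func.df <- function(name, data, DIGITS = 2) {
  tab <- table(data[[name]])
  data.frame(
    variable = name,
    level    = names(tab),
    n        = as.vector(tab),
    percent  = paste0(round(as.vector(prop.table(tab)) * 100, DIGITS), "%"),
    stringsAsFactors = FALSE
  )
}

# One data frame per column name, bound in a single rbind call
CAT <- do.call(rbind, lapply(names(categorical), cat.func.df, data = categorical))
```

With the variable name stored as a column, the blank separator rows are no longer needed to tell the groups apart.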

Upvotes: 0
