Sergi Solé

Reputation: 13

R: Get the mean of each column with a for loop

My assignment requires me to use a for loop to compute the column means. I have the following code:

numericvars <- NULL
for (n in names(millas)){
  if(class(millas[,n]) == 'integer' | class(millas[,n]) == 'numeric'){
    numericvars[n] <- mean(millas[,n], na.rm = TRUE)
  }
}
  
numericvars 

However, I get this warning:

Warning in if (class(millas[, n]) == "integer" | class(millas[, n]) == "numeric") { :
  the condition has length > 1 and only the first element will be used

The data is a tibble (tbl_df):

> dput(head(millas,10))
structure(list(fabricante = c("audi", "audi", "audi", "audi", 
"audi", "audi", "audi", "audi", "audi", "audi"), modelo = c("a4", 
"a4", "a4", "a4", "a4", "a4", "a4", "a4 quattro", "a4 quattro", 
"a4 quattro"), cilindrada = c(1.8, 1.8, 2, 2, 2.8, 2.8, 3.1, 
1.8, 1.8, 2), anio = c(1999L, 1999L, 2008L, 2008L, 1999L, 1999L, 
2008L, 1999L, 1999L, 2008L), cilindros = c(4L, 4L, 4L, 4L, 6L, 
6L, 6L, 4L, 4L, 4L), transmision = c("auto(l5)", "manual(m5)", 
"manual(m6)", "auto(av)", "auto(l5)", "manual(m5)", "auto(av)", 
"manual(m5)", "auto(l5)", "manual(m6)"), traccion = c("d", "d", 
"d", "d", "d", "d", "d", "4", "4", "4"), ciudad = c(18L, 21L, 
20L, 21L, 16L, 18L, 18L, 18L, 16L, 20L), autopista = c(29L, 29L, 
31L, 30L, 26L, 26L, 27L, 26L, 25L, 28L), combustible = c("p", 
"p", "p", "p", "p", "p", "p", "p", "p", "p"), clase = c("compacto", 
"compacto", "compacto", "compacto", "compacto", "compacto", "compacto", 
"compacto", "compacto", "compacto")), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame"))

Upvotes: 0

Views: 739

Answers (1)

Dion Groothof

Reputation: 1456

As @r2evans rightly commented, a for loop is not the recommended way to compute column means.
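For reference, the loop-free version can be quite short. A minimal sketch, assuming a small stand-in data frame (`df_demo` and its values are hypothetical, not the asker's `millas`):

```r
# Hypothetical stand-in for millas: one character column and two numeric columns
df_demo <- data.frame(
  fabricante = c("audi", "audi", "audi"),
  cilindrada = c(1.8, 2.0, 2.8),
  ciudad     = c(18L, 21L, 16L)
)

# Flag the numeric columns (is.numeric() is TRUE for both integer and double)
is_num <- vapply(df_demo, is.numeric, logical(1))

# colMeans() averages every selected column in a single call
col_means <- colMeans(df_demo[is_num], na.rm = TRUE)
print(col_means)
```

Selecting the numeric columns first matters here, because `colMeans()` fails on a data frame that still contains character columns.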

That said, I believe the following will work for you. It indexes columns with df[[i]], so it handles a tibble as well as a plain data.frame.

mean_fun <- \(df) {
    # Collect one mean per numeric column, keyed by column name
    column_means <- c()
    for (i in seq_along(df)) {
        # df[[i]] extracts the column as a plain vector, for tibbles too
        if (is.numeric(df[[i]])) {
            column_means[names(df)[i]] <- mean(df[[i]], na.rm = TRUE)
        }
    }
    return(column_means)
}

Result

> mean_fun(millas)
cilindrada       anio  cilindros     ciudad  autopista 
      2.19    2002.60       4.60      18.60      27.70 
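
As for the warning in the question, it appears to come from how tibbles handle single-bracket indexing: `millas[, n]` returns a one-column tibble rather than a vector, so `class()` gives three values and the `if` condition has length 3. A short sketch, assuming the tibble package is installed (`tb_demo` is a hypothetical stand-in for `millas`):

```r
# Assumes the tibble package is installed; tb_demo is a hypothetical stand-in
tb_demo <- tibble::tibble(modelo = c("a4", "a4"), ciudad = c(18L, 21L))

# Single-bracket indexing keeps the tibble wrapper, so class() has three elements
class(tb_demo[, "ciudad"])
#> [1] "tbl_df"     "tbl"        "data.frame"

# Double-bracket indexing extracts the bare column vector
class(tb_demo[["ciudad"]])
#> [1] "integer"

# The question's loop, using [[n]] and is.numeric() instead of [, n] and class()
numericvars <- NULL
for (n in names(tb_demo)) {
  if (is.numeric(tb_demo[[n]])) {
    numericvars[n] <- mean(tb_demo[[n]], na.rm = TRUE)
  }
}
numericvars
#> ciudad 
#>   19.5 
```

With `[[n]]` the condition is always a single TRUE or FALSE, so the warning goes away and character columns are skipped as intended.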

Upvotes: 1
