Morgan

Reputation: 1

Can I create a function that compares numbers of a column to a mean for different columns?

For 6 different columns, I have to change all values to 'high'/'low' depending on whether the values are bigger or smaller than the mean of the column. I wrote a function that checks whether the values in a column are smaller/bigger than the mean, but it is only applicable to one of the columns:

calc.binary <- function(number){
  number[number > mean(modality$auditory)] <- 'high'
  number[!(number %in% 'high')] <- 'low'
  number
}

modality_bin <- mutate_at(modality, vars(auditory:visual), list(calc.binary))

I now want to apply this function to all 6 columns at once (columns auditory to visual), but I know I'd have to change the calc.binary function for that. How do I change my function so that it takes the mean of each column and compares the numbers in that column to that mean, so that I can apply the same function to all columns?

Upvotes: 0

Views: 36

Answers (3)

Len Greski

Reputation: 10865

A solution with lapply() looks like this, using the mtcars data set as an example. We iterate over a numeric vector of column indices and pass mtcars as an additional argument, which lapply() forwards to the function as y. We create the comparisons, prefix the column names with comp_, and bind the result to the original data with cbind().

meansCompares <- lapply(1:ncol(mtcars),function(x,y){
     ifelse(y[x] >= mean(y[[x]]),"High","Low")
},mtcars)
compares <- do.call(cbind,meansCompares)
# rename cols so we can join with original data 
colnames(compares) <- paste0("comp_",colnames(compares))
head(cbind(mtcars,compares))

...and the output:

> head(cbind(mtcars,compares))
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb comp_mpg comp_cyl comp_disp comp_hp comp_drat
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4     High      Low       Low     Low      High
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4     High      Low       Low     Low      High
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1     High      Low       Low     Low      High
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1     High      Low      High     Low       Low
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2      Low     High      High    High       Low
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1      Low      Low       Low     Low       Low
                  comp_wt comp_qsec comp_vs comp_am comp_gear comp_carb
Mazda RX4             Low       Low     Low    High      High      High
Mazda RX4 Wag         Low       Low     Low    High      High      High
Datsun 710            Low      High    High    High      High       Low
Hornet 4 Drive        Low      High    High     Low       Low       Low
Hornet Sportabout    High       Low     Low     Low       Low       Low
Valiant              High      High    High     Low       Low       Low
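The same lapply() idea can also be applied to the asker's own function: the only change needed is to take the mean of the vector that is passed in rather than of one fixed column. A minimal sketch on mtcars, where the cols selection and the mtcars_bin name are made up for illustration and 'high'/'low' with a strict > follow the question rather than the code above:

```r
# Generalised version of the asker's calc.binary(): compares each value
# to the mean of the vector it receives, not to a hard-coded column
calc.binary <- function(number) {
  ifelse(number > mean(number), "high", "low")
}

# Hypothetical selection of columns to recode
cols <- c("mpg", "hp", "wt")

# lapply() over a data frame iterates over its columns; assigning the
# resulting list back replaces only the selected columns
mtcars_bin <- mtcars
mtcars_bin[cols] <- lapply(mtcars[cols], calc.binary)

head(mtcars_bin[cols])
```

Because lapply() returns a list of columns, the result stays a data frame and the columns that were not selected are left untouched.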

Upvotes: 0

Daniel O

Reputation: 4358

You should be able to use the apply() function in base R:

apply(modality[,c(2:5)], 2,function(x) ifelse(x > mean(x),'high','low'))

Here modality[,c(2:5)] is the data you pass into the function, so choose the columns you want by changing the values in c(2:5). If you would like it done on every column, then apply(modality, ... would suffice, provided all columns are numeric.

In apply( , 2, ), the 2 indicates that we want to run our function column by column; 1 would mean row by row.
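One thing to keep in mind is that apply() returns a character matrix rather than a data frame, so the result has to be written back into the chosen columns if you want to keep the rest of the data. A minimal sketch, assuming a made-up modality data frame (the toy values, the id column and the cols vector are hypothetical):

```r
# Hypothetical toy data standing in for the asker's `modality` data frame
modality <- data.frame(
  id       = 1:4,
  auditory = c(1, 2, 3, 10),
  visual   = c(5, 6, 7, 8)
)

# Columns to recode (hypothetical selection; adjust to your data)
cols <- c("auditory", "visual")

# MARGIN = 2 runs the function once per column; the result is a
# character matrix with one column per input column
binned <- apply(modality[, cols], 2, function(x) ifelse(x > mean(x), "high", "low"))

# Write the matrix back into a copy so the result stays a data frame
modality_bin <- modality
modality_bin[cols] <- as.data.frame(binned, stringsAsFactors = FALSE)
```

Selecting columns by name instead of by position is only a design choice here; it keeps the sketch working if the column order changes.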

Upvotes: 1

Ronak Shah

Reputation: 389235

You can use:

library(dplyr)
modality %>% 
    mutate_at(vars(auditory:visual), ~ifelse(. > mean(.), "higher", "lower"))
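In dplyr 1.0.0 and later, mutate_at() is superseded by across(), so the same idea can also be written as below. This is only a sketch on a made-up modality data frame (the toy values and the id and tactile columns are hypothetical), using the 'high'/'low' labels from the question:

```r
library(dplyr)

# Hypothetical toy data; only the auditory and visual names come from the question
modality <- data.frame(
  id       = 1:4,
  auditory = c(1, 2, 3, 10),
  tactile  = c(4, 4, 9, 9),
  visual   = c(5, 6, 7, 8)
)

# across() applies the lambda to every column from auditory to visual;
# .x is the current column, so mean(.x) is that column's own mean
modality_bin <- modality %>%
  mutate(across(auditory:visual, ~ ifelse(.x > mean(.x), "high", "low")))
```

Columns outside the auditory:visual range, such as id here, are left unchanged.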

Upvotes: 1
