Reputation: 99
My dataset looks like this:
values<-c(9,8,NA,8)
acceptance<-c(8,8,NA,6)
diffusion<-c(9,8,7,NA)
attitudes<-c(7,7,6,NA)
df<-data.frame(values,acceptance,diffusion,attitudes)
values acceptance diffusion attitudes
9 8 9 7
8 8 8 7
NA NA 7 6
8 6 NA NA
I can use mean(df$values, na.rm = T) to get the mean of a single column, but I'd like to create a new variable holding the overall mean of specific columns (e.g. values and acceptance). I could run the same code for each column I want to include and then average the results by hand:
mean(df$values, na.rm = T)      # 8.333
mean(df$acceptance, na.rm = T)  # 7.333
(8.333 + 7.333)/2               # 7.833
df$values_acceptance<-7.833
But this would be really inefficient because I have multiple variables I need to include. I'm sure there is a much easier way to do this, but I'm still getting used to R.
Thanks in advance!
Upvotes: 4
Views: 4726
Reputation: 596
You could use mapply
df2 <- mapply(mean,df,na.rm=T)
Which produces
values acceptance diffusion attitudes
8.333333 7.333333 8.000000 6.666667
Or, if you want the mean of the column means, you could use
mean(mapply(mean,df,na.rm=T))
[1] 7.583333
If you only need specific columns, subset the data frame first
mean(mapply(mean,df[c(1:2)],na.rm=T))
[1] 7.833333
and pass na.rm to the outer mean as well if one of the column means could itself be missing (for example, NaN when a column is entirely NA)
mean(mapply(mean,df[c(1:2)],na.rm=T),na.rm=T)
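To illustrate why the second na.rm can matter, here is a minimal sketch using a made-up data frame (df_all_na is a hypothetical example, not the asker's data) in which one column contains only NA. It uses sapply instead of mapply, which behaves the same way for this purpose.

```r
# Hypothetical data: column b has no observed values at all
df_all_na <- data.frame(
  a = c(1, 2, 3),
  b = c(NA_real_, NA_real_, NA_real_)
)

# Per-column means with NA removed; b becomes NaN (mean of zero values)
col_means <- sapply(df_all_na, mean, na.rm = TRUE)

# Without na.rm the NaN propagates into the overall mean
plain_mean <- mean(col_means)

# With na.rm the NaN column mean is dropped before averaging
safe_mean <- mean(col_means, na.rm = TRUE)
```

Here plain_mean is NaN, while safe_mean is the mean of the remaining column only.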
Upvotes: 0
Reputation: 94
Just use c() to combine the columns you want to include when calculating the total mean:
library(dplyr)
df %>% mutate(new=mean(c(values,acceptance),na.rm = T))
values acceptance diffusion attitudes new
1 9 8 9 7 7.833333
2 8 8 8 7 7.833333
3 NA NA 7 6 7.833333
4 8 6 NA NA 7.833333
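One caveat worth checking: combining the columns with c() gives a pooled mean over all non-missing values, which matches the mean of the column means in this dataset only because both columns have the same number of non-missing values. A small sketch with made-up vectors (x and y are hypothetical) shows the two quantities diverging when the NA counts differ.

```r
# Hypothetical columns with different numbers of missing values
x <- c(10, NA, NA, NA)
y <- c(2, 4, 6, 8)

# Pooled mean: every non-missing value gets equal weight
pooled <- mean(c(x, y), na.rm = TRUE)

# Mean of column means: every column gets equal weight
mean_of_means <- mean(c(mean(x, na.rm = TRUE), mean(y, na.rm = TRUE)))
```

Which one is appropriate depends on whether each observation or each column should carry equal weight.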
Upvotes: 1
Reputation: 15123
You could approach it this way:
library(dplyr)
df %>%
  select(values, acceptance) %>%
  reshape2::melt() %>%
  summarise(n = mean(value, na.rm = TRUE))
The result looks like this:
n
1 7.833333
Upvotes: 0
Reputation: 886938
We can use colMeans on the selected columns, take the mean of the result, and assign the output to a new column (no packages are needed):
df$values_acceptance<- mean(colMeans(df[c('values', 'acceptance')], na.rm = TRUE))
-output
> df
values acceptance diffusion attitudes values_acceptance
1 9 8 9 7 7.833333
2 8 8 8 7 7.833333
3 NA NA 7 6 7.833333
4 8 6 NA NA 7.833333
Or, if we need dplyr:
library(dplyr)
# lambda form; passing extra arguments through across() is deprecated in dplyr >= 1.1.0
df %>%
  mutate(values_acceptance = mean(unlist(across(c(values, acceptance),
    ~ mean(.x, na.rm = TRUE)))))
-output
values acceptance diffusion attitudes values_acceptance
1 9 8 9 7 7.833333
2 8 8 8 7 7.833333
3 NA NA 7 6 7.833333
4 8 6 NA NA 7.833333
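Since the question mentions several sets of variables, the base R approach could be extended with a loop over a named list. This is only a sketch: the groups list and the diffusion_attitudes pairing are assumptions made for illustration, not something stated in the question.

```r
# Same toy data as in the question
df <- data.frame(
  values     = c(9, 8, NA, 8),
  acceptance = c(8, 8, NA, 6),
  diffusion  = c(9, 8, 7, NA),
  attitudes  = c(7, 7, 6, NA)
)

# Hypothetical mapping: new column name -> columns to average
groups <- list(
  values_acceptance   = c("values", "acceptance"),
  diffusion_attitudes = c("diffusion", "attitudes")
)

# For each group, take the mean of the column means and
# store it (recycled over all rows) in a new column
for (new_col in names(groups)) {
  col_means <- colMeans(df[groups[[new_col]]], na.rm = TRUE)
  df[[new_col]] <- mean(col_means, na.rm = TRUE)
}
```

Each new column holds a single value repeated on every row, as in the outputs above.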
Upvotes: 2