Reputation: 796
I have this data frame:
df = structure(list(D = c(-76, -74, -72, -70, -44, -42), A = c(83,
83, 82, 82, 81, 81), B = c(-0.613, -0.4,-0.5, -0.68, -0.13, -0.26)), row.names =
c(NA, 6L), class = "data.frame")
I would like to compute the mean of all values in B that share the same value in A.
For instance, -0.613 and -0.4 should be averaged together because both correspond to A = 83, and so on for the other values of A.
For B alone I can simply do this:
df$Bmean <- with(df, ave(B, A))
However, this only handles B. I need to do the same thing for all the other columns (B, D, etc.) in df.
Upvotes: 1
Views: 169
Reputation: 39613
You could use either of these dplyr approaches:
library(dplyr)
# Approach 1: keep all rows and replace every non-grouping column with its group mean
df %>% group_by(A) %>% mutate_all(mean, na.rm = TRUE)
# A tibble: 6 x 3
# Groups: A [3]
D A B
<dbl> <dbl> <dbl>
1 -75 83 -0.506
2 -75 83 -0.506
3 -71 82 -0.59
4 -71 82 -0.59
5 -43 81 -0.195
6 -43 81 -0.195
# Approach 2: collapse to one row per distinct value of A
df %>% group_by(A) %>% summarise_all(mean, na.rm = TRUE)
# A tibble: 3 x 3
A D B
<dbl> <dbl> <dbl>
1 81 -43 -0.195
2 82 -71 -0.59
3 83 -75 -0.506
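For comparison, a base R sketch that gives the same one-row-per-group summary as Approach 2 without dplyr. It is not part of the approaches above; it assumes the formula interface of aggregate and rebuilds the question's data so that it runs on its own.

```r
# Same data as in the question
df <- data.frame(
  D = c(-76, -74, -72, -70, -44, -42),
  A = c(83, 83, 82, 82, 81, 81),
  B = c(-0.613, -0.4, -0.5, -0.68, -0.13, -0.26)
)

# ". ~ A" means: summarise every other column, grouped by A
res <- aggregate(. ~ A, data = df, FUN = mean)
res
```

Note that the formula method of aggregate drops rows containing NA by default (na.action = na.omit), which is not quite the same as na.rm = TRUE applied column by column.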
Upvotes: 1
Reputation: 887951
We can use mutate with across from dplyr for multiple columns:
library(dplyr) # 1.0.0
df %>%
group_by(A) %>%
mutate(across(everything(), list(mean = ~ mean(.))))
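To replace the original columns with their group means instead:
df %>%
group_by(A) %>%
# a lambda is used because passing extra arguments through across() is deprecated as of dplyr 1.1.0
mutate(across(everything(), ~ mean(.x, na.rm = TRUE)))
NOTE: na.rm = TRUE is added in case there are any NA values; by default, na.rm = FALSE, so a group containing an NA would get a mean of NA.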
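A tiny illustration of why na.rm matters, using a made-up vector x that is not part of the question's data:

```r
# Hypothetical vector with one missing value
x <- c(1, 2, NA)

mean(x)                # NA, because na.rm defaults to FALSE
mean(x, na.rm = TRUE)  # 1.5, the NA is dropped before averaging
```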
Or, to have fine control over the names of the new columns:
df1 <- df %>%
group_by(A) %>%
mutate(across(everything(), list(mean = ~ mean(.)), .names = "{col}mean"))
df1
# A tibble: 6 x 5
# Groups: A [3]
# D A B Dmean Bmean
# <dbl> <dbl> <dbl> <dbl> <dbl>
#1 -76 83 -0.613 -75 -0.506
#2 -74 83 -0.4 -75 -0.506
#3 -72 82 -0.5 -71 -0.59
#4 -70 82 -0.68 -71 -0.59
#5 -44 81 -0.13 -43 -0.195
#6 -42 81 -0.26 -43 -0.195
Or, using ave for multiple columns: get the vector of column names other than the grouping column "A" with setdiff ('nm1'), loop over that vector, extract each column from the dataset, pass it to ave together with the grouping column, and assign the results back to the dataset as new columns whose names are built with paste0.
nm1 <- setdiff(names(df), "A")
df[paste0(nm1, "mean")] <- lapply(nm1, function(nm) ave(df[[nm]], df$A))
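If the columns may contain missing values, the same loop can pass a custom function through ave's FUN argument. This is a minimal sketch rather than part of the original approach: the NA placed in B is an assumption made purely for illustration, and FUN must be given by name because ave treats unnamed extra arguments as grouping variables.

```r
# Question data, with one value of B set to NA for illustration only
df <- data.frame(
  D = c(-76, -74, -72, -70, -44, -42),
  A = c(83, 83, 82, 82, 81, 81),
  B = c(NA, -0.4, -0.5, -0.68, -0.13, -0.26)
)

# All columns except the grouping column
nm1 <- setdiff(names(df), "A")

# Group means that ignore NA values, added as new "<name>mean" columns
df[paste0(nm1, "mean")] <- lapply(nm1, function(nm) {
  ave(df[[nm]], df$A, FUN = function(x) mean(x, na.rm = TRUE))
})
df
```

With the default FUN (plain mean), both rows of the A = 83 group would get NA in Bmean; here they get -0.4, the only non-missing value in that group.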
Upvotes: 1