Reputation: 1105
So I have a data frame , simplified as this:
ID A B C
1 1 5 0
2 3 0 3
3 0 2 1
2 5 9 1
3 3 5 3
1 2 6 4
Simply put, I want to calculate the following for each row:
Easy enough, but the hard part for me is after taking each, how do I create a mean value to represent each ID.
So after I get these values, how do I show the AVERAGE MEAN/MED/MAX/MIN for each ID???
Expected output:
(1)
ID Mean Median Min Max
1 2 1 0 5
2 2 3 0 3
3 1 1 0 2
2 5 5 1 9
3 3.66 3 3 5
1 4 4 2 6
(2)
ID AvgMean AvgMedian AvgMin AvgMax
1 3 2.5 1 5.5
2 3.5 4 1 6
3 2.33 3 3 3.5
Upvotes: 1
Views: 2179
Reputation: 9485
You can try something like this:
library(dplyr)
df %>%
group_by(ID) %>%
summarise(mean_ = mean(c_across(A:C), na.rm = T),
medi_ = median(c_across(A:C), na.rm = T),
max_ = max(c_across(A:C), na.rm = T),
min_ = min(c_across(A:C), na.rm = T))
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 3 x 5
ID mean_ medi_ max_ min_
<int> <dbl> <dbl> <int> <int>
1 1 3 3 6 0
2 2 3.5 3 9 0
3 3 2.33 2.5 5 0
For the second part:
df %>%
rowwise() %>%
summarise(mean_ = mean(c_across(A:C), na.rm = T),
medi_ = median(c_across(A:C), na.rm = T),
max_ = max(c_across(A:C), na.rm = T),
min_ = min(c_across(A:C), na.rm = T))
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 6 x 4
mean_ medi_ max_ min_
<dbl> <int> <int> <int>
1 2 1 5 0
2 2 3 3 0
3 1 1 2 0
4 5 5 9 1
5 3.67 3 5 3
6 4 4 6 2
With data:
df <- structure(list(ID = c(1L, 2L, 3L, 2L, 3L, 1L), A = c(1L, 3L,
0L, 5L, 3L, 2L), B = c(5L, 0L, 2L, 9L, 5L, 6L), C = c(0L, 3L,
1L, 1L, 3L, 4L)), class = "data.frame", row.names = c(NA, -6L
))
Upvotes: 4
Reputation: 76450
In base R, the following seems to do what the question asks for.
out
is a data.frame of the statistics ungrouped and out2
of the grouped ones.
fun <- function(X){
f <- function(x, na.rm = FALSE){
c(
Mean = mean(x, na.rm = na.rm),
Median = median(x, na.rm = na.rm),
Min = min(x, na.rm = na.rm),
Max = max(x, na.rm = na.rm)
)
}
t(apply(X, 1, f))
}
out <- lapply(split(df1[-1], df1$ID), fun)
out2 <- lapply(out, colMeans)
out <- do.call(rbind, out)
out <- cbind.data.frame(ID = row.names(out), out)
out2 <- cbind.data.frame(ID = names(out2), do.call(rbind, out2))
out
# ID Mean Median Min Max
#1 1 2.000000 1 0 5
#6 6 4.000000 4 2 6
#2 2 2.000000 3 0 3
#4 4 5.000000 5 1 9
#3 3 1.000000 1 0 2
#5 5 3.666667 3 3 5
out2
# ID Mean Median Min Max
#1 1 3.000000 2.5 1.0 5.5
#2 2 3.500000 4.0 0.5 6.0
#3 3 2.333333 2.0 1.5 3.5
Upvotes: 0
Reputation: 757
thanks for posting expected output. I would consider using summarize and across together
library(dplyr)
df <- df %>%
group_by(ID)
summarize(across(2:4, mean))
Upvotes: 0