Reputation: 11
I need to calculate mean values of selected rows of a data frame with the same index. In detail, I need the mean value of TOC for all LB <= 30 for the same PointID (alltogether 3400 PointIDs). The data frame looks like this:
PointId LB TOC
2 10 21,83
2 30 20,75
2 50 11,51
2 70 6,91
2 100 1,57
3 10 20,64
3 30 15,89
3 50 5,74
3 70 5,79
3 100 5,81
4 10 19,15
4 30 15,17
4 50 22,2
4 70 10,28
4 100 6,52
.....
Any help is greatly appreciated!
Upvotes: 1
Views: 1509
Reputation: 9485
Something like this in dplyr
:
library(dplyr)
df %>%
filter(LB <= 30) %>%
group_by(PointId) %>%
summarise(mean_ = mean(TOC))
# A tibble: 3 x 2
PointId mean_
<int> <dbl>
1 2 21.3
2 3 18.3
3 4 17.2
Or with data.table
:
library(data.table)
data.table(df)[LB <= 30, .(mean_ = mean(TOC)), by = PointId]
PointId mean_
1: 2 21.290
2: 3 18.265
3: 4 17.160
Or base R:
aggregate(TOC ~ PointId, data = df[df$LB <= 30,], mean)
PointId TOC
1 2 21.290
2 3 18.265
3 4 17.160
With data:
structure(list(PointId = c(2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
3L, 4L, 4L, 4L, 4L, 4L), LB = c(10L, 30L, 50L, 70L, 100L, 10L,
30L, 50L, 70L, 100L, 10L, 30L, 50L, 70L, 100L), TOC = c(21.83,
20.75, 11.51, 6.91, 1.57, 20.64, 15.89, 5.74, 5.79, 5.81, 19.15,
15.17, 22.2, 10.28, 6.52)), class = "data.frame", row.names = c(NA,
-15L))
Upvotes: 1
Reputation: 1241
with data.table
package is really easy
library(data.table)
library(dplyr)
dt <- dt %>% as.data.table()
dt[LB <= 30, mean(TOC), by = PointId]
PointId V1
1: 2 21.290
2: 3 18.265
3: 4 17.160
Upvotes: 0
Reputation: 805
You can filter
first based on LB
and then use summarise
df %>% group_by(PointId) %>%
filter(LB <= 30) %>%
summarise(Mean = mean(TOC, na.rm = TRUE))
# A tibble: 3 x 2
PointId Mean
<int> <dbl>
1 2 2129
2 3 1826.
3 4 1716
Upvotes: 0