Reputation: 335
I would like to sum a column (by ID) depending on another variable (group). If we take for instance:
ID t group
1 12 1
1 14 1
1 2 6
2 0.5 7
2 12 1
3 3 1
4 2 4
I'd like to sum values of column t separately for each ID only if group==1, and obtain:
ID t group sum
1 12 1 26
1 14 1 26
1 2 6 NA
2 0.5 7 NA
2 12 1 12
3 3 1 3
4 2 4 NA
Upvotes: 0
Views: 219
Reputation: 887831
We can use data.table
methods. Convert the 'data.frame' to 'data.table' (setDT(df)
), grouped by 'ID', specify the i
with the logical expression group ==1
, get the sum
of 't' and assign (:=
) it to 'new'. By default, other rows are assigned to NA
by default
library(data.table)
setDT(df)[group == 1, new := sum(t), ID]
df
# ID t group new
#1: 1 12.0 1 26
#2: 1 14.0 1 26
#3: 1 2.0 6 NA
#4: 2 0.5 7 NA
#5: 2 12.0 1 12
#6: 3 3.0 1 3
#7: 4 2.0 4 NA
df <- structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 3L, 4L), t = c(12,
14, 2, 0.5, 12, 3, 2), group = c(1L, 1L, 6L, 7L, 1L, 1L, 4L)),
class = "data.frame", row.names = c(NA,
-7L))
Upvotes: 0
Reputation: 107747
Consider base R with ifelse
and ave()
for conditional inline aggregation.
df$sum <- with(df, ifelse(group == 1, ave(t, ID, group, FUN=sum), NA))
df
# ID t group sum
# 1 1 12.0 1 26
# 2 1 14.0 1 26
# 3 1 2.0 6 NA
# 4 2 0.5 7 NA
# 5 2 12.0 1 12
# 6 3 3.0 1 3
# 7 4 2.0 4 NA
Upvotes: 1
Reputation: 51592
Using dplyr
,
df %>%
group_by(ID) %>%
mutate(new = sum(t[group == 1]),
new = replace(new, group != 1, NA))
which gives,
# A tibble: 7 x 4 # Groups: ID [4] ID t group new <int> <dbl> <int> <dbl> 1 1 12 1 26 2 1 14 1 26 3 1 2 6 NA 4 2 0.5 7 NA 5 2 12 1 12 6 3 3 1 3 7 4 2 4 NA
Upvotes: 1