Reputation: 335

sum for each ID depending on another variable

I would like to sum a column (by ID) depending on another variable (group). If we take for instance:

ID      t        group   
 1      12         1
 1      14         1
 1      2          6
 2      0.5        7
 2      12         1
 3      3          1
 4      2          4

I'd like to sum values of column t separately for each ID only if group==1, and obtain:

ID      t        group     sum
 1      12         1        26
 1      14         1        26
 1      2          6        NA
 2      0.5        7        NA
 2      12         1        12
 3      3          1         3
 4      2          4        NA

Upvotes: 0

Answers (3)

akrun

Reputation: 887831

We can use data.table methods. Convert the 'data.frame' to 'data.table' (setDT(df)), grouped by 'ID', specify the i with the logical expression group ==1, get the sum of 't' and assign (:=) it to 'new'. By default, other rows are assigned to NA by default

library(data.table)
setDT(df)[group == 1,  new := sum(t), ID]
df
#   ID    t group new
#1:  1 12.0     1  26
#2:  1 14.0     1  26
#3:  1  2.0     6  NA
#4:  2  0.5     7  NA
#5:  2 12.0     1  12
#6:  3  3.0     1   3
#7:  4  2.0     4  NA

data

df <- structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 3L, 4L), t = c(12, 
14, 2, 0.5, 12, 3, 2), group = c(1L, 1L, 6L, 7L, 1L, 1L, 4L)),
class = "data.frame", row.names = c(NA, 
-7L))

Upvotes: 0

Parfait

Reputation: 107747

Consider base R with ifelse and ave() for conditional inline aggregation.

df$sum <- with(df, ifelse(group == 1, ave(t, ID, group, FUN=sum), NA))

df

#   ID    t group sum
# 1  1 12.0     1  26
# 2  1 14.0     1  26
# 3  1  2.0     6  NA
# 4  2  0.5     7  NA
# 5  2 12.0     1  12
# 6  3  3.0     1   3
# 7  4  2.0     4  NA

Rextester demo

Upvotes: 1

Sotos

Reputation: 51592

Using dplyr,

df %>% 
 group_by(ID) %>% 
 mutate(new = sum(t[group == 1]), 
        new = replace(new, group != 1, NA))

which gives,

# A tibble: 7 x 4
# Groups:   ID [4]
     ID     t group   new
  <int> <dbl> <int> <dbl>
1     1  12       1    26
2     1  14       1    26
3     1   2       6    NA
4     2   0.5     7    NA
5     2  12       1    12
6     3   3       1     3
7     4   2       4    NA

Upvotes: 1

sum for each ID depending on another variable

Answers (3)

data

Related Questions