Harold
Harold

Reputation: 383

Conditional sum of Data Table by key in R

I have a data table 'df' with 3 columns. id , meal , time Each id has many rows.

library(data.table)
id = c(1,1,2,2,3,3)
meal = c(1,1,0,0,1,0)
time = c(10,9,12,13,7,15)
df <- data.table(id, meal, time)

> df
   id meal time
1:  1    1   10
2:  1    1    9
3:  2    0   12
4:  2    0   13
5:  3    1    7
6:  3    0   15

Now, I want to calculate the sum of time when meal == 1 of each ID. My current code is:

df[meal == 1, sum(time), by = "id"]

However, this code excute meal == 1 first, so when some id have no record with meal == 1, it would be omit instead of return 0.

id 2 is omitted here.

   id V1
1:  1 19
2:  3  7

What can I do?

Upvotes: 0

Views: 1572

Answers (2)

Ben
Ben

Reputation: 21625

How about this?

library(data.table)

dt <- data.table(
  id = c(1, 1, 2, 2, 3, 3),
  meal = c(1, 5, 3, 2, 10, 1),
  time = c(2, 10, 15, 5, 5, 2)
)

   id meal time
1:  1    1    2
2:  1    5   10
3:  2    3   15
4:  2    2    5
5:  3   10    5
6:  3    1    2

dt[, list(Meal1Time = sum(time[which(meal == 1)])), by=id]
   id Meal1Time
1:  1         2
2:  2         0
3:  3         2

Note that this is not the most efficient method, but unless your working with millions of rows of data, it should run almost instantaneously.

Upvotes: 0

Bulat
Bulat

Reputation: 6969

You can try something like this:

df[, is.meal.one := as.integer(meal == 1)]
df[, sum(time * is.meal.one), by = "id"]

Upvotes: 1

Related Questions