Reputation: 383
I have a data table 'df' with 3 columns. id , meal , time Each id has many rows.
library(data.table)
id = c(1,1,2,2,3,3)
meal = c(1,1,0,0,1,0)
time = c(10,9,12,13,7,15)
df <- data.table(id, meal, time)
> df
id meal time
1: 1 1 10
2: 1 1 9
3: 2 0 12
4: 2 0 13
5: 3 1 7
6: 3 0 15
Now, I want to calculate the sum of time when meal == 1 of each ID. My current code is:
df[meal == 1, sum(time), by = "id"]
However, this code excute meal == 1 first, so when some id have no record with meal == 1, it would be omit instead of return 0.
id 2 is omitted here.
id V1
1: 1 19
2: 3 7
What can I do?
Upvotes: 0
Views: 1572
Reputation: 21625
How about this?
library(data.table)
dt <- data.table(
id = c(1, 1, 2, 2, 3, 3),
meal = c(1, 5, 3, 2, 10, 1),
time = c(2, 10, 15, 5, 5, 2)
)
id meal time
1: 1 1 2
2: 1 5 10
3: 2 3 15
4: 2 2 5
5: 3 10 5
6: 3 1 2
dt[, list(Meal1Time = sum(time[which(meal == 1)])), by=id]
id Meal1Time
1: 1 2
2: 2 0
3: 3 2
Note that this is not the most efficient method, but unless your working with millions of rows of data, it should run almost instantaneously.
Upvotes: 0
Reputation: 6969
You can try something like this:
df[, is.meal.one := as.integer(meal == 1)]
df[, sum(time * is.meal.one), by = "id"]
Upvotes: 1