Reputation: 21
I need your help please to solve the following. I have a dataframe as shown below, with over 100,000 rows approx.
ID Date Value Type
AAA 01/01/20 10 0
BBB 26/01/20 10 0
AAA 05/02/20 80 0
AAA 06/02/20 90 1
BBB 21/02/20 60 1
AAA 08/02/20 10 0
AAA 10/03/20 80 1
I need a function or some code that allows me to add the column "value" until the date in which appears the "first" 1 in the column "type" and group them later, it should be as follows:
ID Value
AAA 180
BBB 70
The sum of "AAA" is 180 since it added up to find the date (06/02/20) where the "1" appears in the "Type" column.
In other words, it is to make a sum of a column up to a certain specific date, which I can indicate with the value of another column.
Thank you very much for your help. This is my first question in the community.
If you have an alternative answer, obviating some requirements, I also appreciate it.
Upvotes: 0
Views: 289
Reputation: 388972
You can use match
to get first occurrence of 1 in Type
column and sum
Value
column until then.
library(dplyr)
df %>%
group_by(ID) %>%
summarise(Value = sum(Value[seq_len(match(1, Type))]))
# ID Value
# <chr> <int>
#1 AAA 180
#2 BBB 70
data
df <- structure(list(ID = c("AAA", "BBB", "AAA", "AAA", "BBB", "AAA",
"AAA"), Date = c("01/01/20", "26/01/20", "05/02/20", "06/02/20",
"21/02/20", "08/02/20", "10/03/20"), Value = c(10L, 10L, 80L,
90L, 60L, 10L, 80L), Type = c(0L, 0L, 0L, 1L, 1L, 0L, 1L)),
class = "data.frame", row.names = c(NA, -7L))
Upvotes: 3