lfvalencia
lfvalencia

Reputation: 21

Group and add values of a column up to a certain date. R

I need your help please to solve the following. I have a dataframe as shown below, with over 100,000 rows approx.

    ID     Date    Value    Type
   AAA   01/01/20    10     0    
   BBB   26/01/20    10     0
   AAA   05/02/20    80     0 
   AAA   06/02/20    90     1
   BBB   21/02/20    60     1 
   AAA   08/02/20    10     0
   AAA   10/03/20    80     1

I need a function or some code that allows me to add the column "value" until the date in which appears the "first" 1 in the column "type" and group them later, it should be as follows:

    ID    Value
   AAA     180          
   BBB      70 

The sum of "AAA" is 180 since it added up to find the date (06/02/20) where the "1" appears in the "Type" column.

In other words, it is to make a sum of a column up to a certain specific date, which I can indicate with the value of another column.

Thank you very much for your help. This is my first question in the community.

If you have an alternative answer, obviating some requirements, I also appreciate it.

Upvotes: 0

Views: 289

Answers (1)

Ronak Shah
Ronak Shah

Reputation: 388972

You can use match to get first occurrence of 1 in Type column and sum Value column until then.

library(dplyr)

df %>%
  group_by(ID) %>%
  summarise(Value = sum(Value[seq_len(match(1, Type))]))

#  ID    Value
#  <chr> <int>
#1 AAA     180
#2 BBB      70

data

df <- structure(list(ID = c("AAA", "BBB", "AAA", "AAA", "BBB", "AAA", 
"AAA"), Date = c("01/01/20", "26/01/20", "05/02/20", "06/02/20", 
"21/02/20", "08/02/20", "10/03/20"), Value = c(10L, 10L, 80L, 
90L, 60L, 10L, 80L), Type = c(0L, 0L, 0L, 1L, 1L, 0L, 1L)), 
class = "data.frame", row.names = c(NA, -7L))

Upvotes: 3

Related Questions