Pake
Pake

Reputation: 1118

How can I transform columns into a sum conditional on the values in another numeric column?

I have a dataframe designed to compare tickets between two years.

df<-data.frame(year=c("1","2","1","2","1","2"),
           Daysleft=c(8,8,7,7,6,6),
           Count_TypeA=c(1,1,1,1,1,1),
           Count_TypeB=c(5,2,5,2,5,2))

In this example data frame, days left indicates days remaining until the event. What I'd like to do is come up with the cumulative counts where Daysleft >= the Daysleft value, grouped by year. The desired output is below:

desired<-data.frame(year=c("1","2","1","2","1","2"),
                Daysleft=c(8,8,7,7,6,6),
                Count_TypeA=c(1,1,2,2,3,3),
                Count_TypeB=c(5,2,10,4,15,6))

In my real data, there are many Count_Type variables, so ideally I'm hoping for an answer that allows me to avoid naming all the columns using their typed names.

For a quick view of desired output:

  year Daysleft Count_TypeA Count_TypeB
    1        8           1           5
    2        8           1           2
    1        7           2          10
    2        7           2           4
    1        6           3          15
    2        6           3           6

I am guessing there is a trick with dplyr's group_by() that could help, but answers found for similar questions (like Using dplyr to get cumulative count by group) aren't clear to me.

df2<-df %>%
   group_by(year)%>%

Upvotes: 0

Views: 132

Answers (2)

akash87
akash87

Reputation: 3994

If you need to do each column independently, then you can use:

df %>% group_by(year) %>% mutate_at(vars(-year, -Daysleft),cumsum)

Upvotes: 1

maarvd
maarvd

Reputation: 1284

I think the cumsum function works here.

desired <- df %>% group_by(year) %>% mutate(Count_TypeB = cumsum(Count_TypeB)) %>% mutate(Count_TypeA = cumsum(Count_TypeA))

Upvotes: 0

Related Questions