Dr. Flow
Dr. Flow

Reputation: 486

Group according to cumulative sums

I want to make groups according to the accumulated sum. in my tiny example, the two first numbers in the column value sums to exactly 100 and is group A. The three next numbers sum to less than 100 and go to group B, while the last number exceeds 100 and gets its own group C - and so forth.

input <- data.frame(id=c(1:6),
                   value =c(99, 1,  33, 33, 33, 150))
input

desired_output <- data.frame(id=c(1:6),
              value =c(99, 1,  33, 33, 33, 150),
              group= c("A", "A", "B", "B", "B", "C"))

desired_output

Thank you

Upvotes: 1

Views: 117

Answers (2)

Ma&#235;l
Ma&#235;l

Reputation: 52399

Two possible one-liners, with purrr::accumulate and with MESS::cumsumbinning:

purrr::accumulate

library(tidyverse)
group_by(input, grp = LETTERS[cumsum(value == accumulate(value, ~ ifelse(.x + .y <= 100, .x + .y, .y)))])

MESS::cumsumbinning

library(dplyr)
group_by(input, grp = LETTERS[MESS::cumsumbinning(value, 100)])

output

# A tibble: 6 x 3
# Groups:   grp [3]
     id value grp  
  <int> <dbl> <chr>
1     1    99 A    
2     2     1 A    
3     3    33 B    
4     4    33 B    
5     5    33 B    
6     6   150 C    

Upvotes: 2

Yuriy Saraykin
Yuriy Saraykin

Reputation: 8880

desired_output <- data.frame(id=c(1:6),
                             value =c(99, 1,  33, 33, 33, 150),
                             group= c("A", "A", "B", "B", "B", "C"))

library(tidyverse)

desired_output %>% 
  group_by(grp_new = cumsum(value - 1) %/% 100) %>% 
  mutate(grp_new = LETTERS[cur_group_id()]) %>% 
  ungroup() 
#> # A tibble: 6 x 4
#>      id value group grp_new
#>   <int> <dbl> <chr> <chr>  
#> 1     1    99 A     A      
#> 2     2     1 A     A      
#> 3     3    33 B     B      
#> 4     4    33 B     B      
#> 5     5    33 B     B      
#> 6     6   150 C     C

Created on 2022-04-15 by the reprex package (v2.0.1)

Upvotes: 1

Related Questions