ads91
ads91

Reputation: 11

Cumulative sum of consecutive values

I have a simple dataset with US NBER recessions that are currently coded as dummy variables. I would like to sequentially label each recession. For example, in the below table, I want the recession column to read "Recession 1", "No Recession", "Recession 2", and so on, thereby categorizing each recession.

Date    Recession 
1949-06-30 1
1949-09-30 1
1949-12-31 1
1950-03-31 0
1950-06-30 0
1953-09-30 1
1953-12-31 1

Upvotes: 0

Views: 516

Answers (5)

Rui Barradas
Rui Barradas

Reputation: 76402

Here is a cumsum trick.

x <- c(1, 1, 1, 0, 0, 1, 1)
i <- cumsum(c(1, diff(x) != 0) & as.logical(x))
ifelse(x == 0, "No Recession", paste("Recession", i))
#[1] "Recession 1"  "Recession 1"  "Recession 1"  "No Recession"
#[5] "No Recession" "Recession 2"  "Recession 2"

Upvotes: 2

Calum You
Calum You

Reputation: 15062

Here's a tidyverse approach:

  1. Use lag to figure out if the status of recession changed
  2. Use & and cumsum to figure out if it changed from no recession to recession
  3. Replace all the rows that should be "Recession" with "No Recession" with if_else
library(tidyverse)
df <- read_table2(
"Date  Recession
1949-06-30 1
1949-09-30 1
1949-12-31 1
1950-03-31 0
1950-06-30 0
1953-09-30 1
1953-12-31 1"
)

df %>%
  mutate(
    changed = Recession != lag(Recession, default = Recession[1]),
    to_recession = str_c("Recession ", cumsum(changed & as.logical(Recession)) + 1),
    Recession_Num = if_else(Recession == 1, to_recession, "No Recession")
    ) %>%
  select(-changed, -to_recession)
#> # A tibble: 7 x 3
#>   Date       Recession Recession_Num
#>   <date>         <int> <chr>        
#> 1 1949-06-30         1 Recession 1  
#> 2 1949-09-30         1 Recession 1  
#> 3 1949-12-31         1 Recession 1  
#> 4 1950-03-31         0 No Recession 
#> 5 1950-06-30         0 No Recession 
#> 6 1953-09-30         1 Recession 2  
#> 7 1953-12-31         1 Recession 2

Created on 2018-10-30 by the reprex package (v0.2.1)

Upvotes: 2

pogibas
pogibas

Reputation: 28309

You can use rle to count consecutive runs of 1 and repeat (rep) them corresponding number of times (lengths)

foo <- with(rle(input$Recession), rep(cumsum(values) * values, lengths))
ifelse(foo == 0, "No Recession", paste("Recession", foo))

Input:

structure(list(Date = c("1949-06-30", "1949-09-30", "1949-12-31", 
"1950-03-31", "1950-06-30", "1953-09-30", "1953-12-31"), Recession = c(1L, 
1L, 1L, 0L, 0L, 1L, 1L)), row.names = c(NA, -7L), class = "data.frame")

Upvotes: 3

Muffindorf
Muffindorf

Reputation: 103

Date <- as.Date(c('1949-06-30', '1949-09-30', '1949-12-31', '1950-03-31', '1950-06-30', '1953-09-30', '1953-12-31'), 
                format = '%Y-%m-%d')
Recession <- c(1,1,1,0,0,1,1)

df <- data.frame(Date, Recession)

find_seq_1s <- function(x) {
        count <-  0
        in_seq <-  FALSE
        output <-  NULL
        for(i in x) {
                if(i == 1 && in_seq == FALSE) {
                        count <- count + 1
                        in_seq <-  TRUE
                        output <-  c(output, paste('Recession', as.character(count)))
                } else if(i == 1 && in_seq == TRUE) {
                        output <-  c(output, paste('Recession', as.character(count)))
                } else {
                        in_seq <-  FALSE
                        output <-  c(output, 'No Recession')
                }
        }
        return(output)
}

df$Rec_Seq <- find_seq_1s(df$Recession)

Upvotes: 1

T. Ciffr&#233;o
T. Ciffr&#233;o

Reputation: 126

unlist(lapply(1:nrow(df), FUN = function(x) ifelse(df$recession[x]==1, paste("Recession", x), "No Recession")))

Upvotes: 0

Related Questions