dave_in_newengland
dave_in_newengland

Reputation: 251

Using intervals to identify if a column of dates is within a vector of other dates

I was trying to take a vector of dates (recession dates), create an interval, and then use this interval to create a new column in an existing data frame indicating whether a date fell within a recession period.

I am sure there are multiple ways to do this, but the way I was trying did not work and I wasn't sure how to fix. I was trying to use this approach but to no avail.

Reproducible Example

library(tidyverse)
library(lubridate)


# dput data
recession_dates <- structure(list(start = structure(c(4199, 7486, 11382, 13848), class = "Date"), 
                           stop = structure(c(4687, 7729, 11627, 14396), class = "Date")), class = c("tbl_df", 
                                                                                                     "tbl", "data.frame"), row.names = c(NA, -4L))


my_df <- structure(list(Date = structure(c(3742, 3833, 3925, 4017, 4107, 
                                             4198, 4290, 4382, 4472, 4563, 4655, 4747, 4837, 4928, 5020, 5112, 
                                             5203, 5294, 5386, 5478, 5568, 5659, 5751, 5843, 5933, 6024, 6116, 
                                             6208, 6298, 6389, 6481, 6573, 6664, 6755, 6847, 6939, 7029, 7120, 
                                             7212, 7304, 7394, 7485, 7577, 7669, 7759, 7850, 7942, 8034, 8125, 
                                             8216, 8308, 8400, 8490, 8581, 8673, 8765, 8855, 8946, 9038, 9130, 
                                             9220, 9311, 9403, 9495, 9586, 9677, 9769, 9861, 9951, 10042, 
                                             10134, 10226, 10316, 10407, 10499, 10591, 10681, 10772, 10864, 
                                             10956, 11047, 11138, 11230, 11322, 11412, 11503, 11595, 11687, 
                                             11777, 11868, 11960, 12052, 12142, 12233, 12325, 12417, 12508, 
                                             12599, 12691, 12783, 12873, 12964, 13056, 13148, 13238, 13329, 
                                             13421, 13513, 13603, 13694, 13786, 13878, 13969, 14060, 14152, 
                                             14244, 14334, 14425, 14517, 14609, 14699, 14790, 14882, 14974, 
                                             15064, 15155, 15247, 15339, 15430, 15521, 15613, 15705, 15795, 
                                             15886, 15978, 16070, 16160, 16251, 16343, 16435, 16525, 16616, 
                                             16708, 16800, 16891, 16982, 17074, 17166, 17256, 17347, 17439, 
                                             17531, 17621, 17712, 17804, 17896, 17986, 18077, 18169), class = "Date"), 
                          Asset = c("real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012", 
                                    "real_gdp_usd2012", "real_gdp_usd2012", "real_gdp_usd2012"
                          ), Value = c(6837.6, 6696.8, 6688.8, 6813.5, 6947, 6895.6, 
                                       6978.1, 6902.1, 6794.9, 6825.9, 6799.8, 6802.5, 6892.1, 7049, 
                                       7189.9, 7339.9, 7483.4, 7612.7, 7686.1, 7749.2, 7824.2, 7893.1, 
                                       8013.7, 8073.2, 8148.6, 8185.3, 8263.6, 8308, 8369.9, 8460.2, 
                                       8533.6, 8680.2, 8725, 8839.6, 8891.4, 9009.9, 9101.5, 9171, 
                                       9238.9, 9257.1, 9358.3, 9392.3, 9398.5, 9312.9, 9269.4, 9341.6, 
                                       9388.8, 9421.6, 9534.3, 9637.7, 9733, 9834.5, 9851, 9908.3, 
                                       9955.6, 10091, 10189, 10327, 10387.4, 10506.4, 10543.6, 10575.1, 
                                       10665.1, 10737.5, 10817.9, 10998.3, 11097, 11212.2, 11284.6, 
                                       11472.1, 11615.6, 11715.4, 11832.5, 11942, 12091.6, 12287, 
                                       12403.3, 12498.7, 12662.4, 12877.6, 12924.2, 13160.8, 13178.4, 
                                       13260.5, 13222.7, 13300, 13244.8, 13280.9, 13397, 13478.2, 
                                       13538.1, 13559, 13634.3, 13751.5, 13985.1, 14145.6, 14221.1, 
                                       14329.5, 14465, 14609.9, 14771.6, 14839.8, 14972.1, 15066.6, 
                                       15267, 15302.7, 15326.4, 15456.9, 15493.3, 15582.1, 15666.7, 
                                       15762, 15671.4, 15752.3, 15667, 15328, 15155.9, 15134.1, 
                                       15189.2, 15356.1, 15415.1, 15557.3, 15672, 15750.6, 15712.8, 
                                       15825.1, 15820.7, 16004.1, 16129.4, 16198.8, 16220.7, 16239.1, 
                                       16383, 16403.2, 16531.7, 16663.6, 16616.5, 16841.5, 17047.1, 
                                       17143, 17277.6, 17405.7, 17463.2, 17468.9, 17556.8, 17639.4, 
                                       17735.1, 17824.2, 17925.3, 18021, 18163.6, 18322.5, 18438.3, 
                                       18598.1, 18732.7, 18783.5, 18927.3, 19021.9, 19112.5)), row.names = c(NA, 
                                                                                                             -159L), class = "data.frame")




# Turn recession dates into recession intervals
recession_intervals <- interval(recession_dates$start, recession_dates$stop)

# Test to see if works
my_df %>% mutate(recession = Date %within% recession_intervals)

Does not work:

   # partial output
> my_df %>% mutate(recession = Date %within% recession_intervals)
          Date            Asset   Value recession
1   1980-03-31 real_gdp_usd2012  6837.6     FALSE
2   1980-06-30 real_gdp_usd2012  6696.8     FALSE
3   1980-09-30 real_gdp_usd2012  6688.8     FALSE
4   1980-12-31 real_gdp_usd2012  6813.5     FALSE
5   1981-03-31 real_gdp_usd2012  6947.0     FALSE
6   1981-06-30 real_gdp_usd2012  6895.6     FALSE
7   1981-09-30 real_gdp_usd2012  6978.1     FALSE
8   1981-12-31 real_gdp_usd2012  6902.1     FALSE
9   1982-03-31 real_gdp_usd2012  6794.9      TRUE
10  1982-06-30 real_gdp_usd2012  6825.9     FALSE
11  1982-09-30 real_gdp_usd2012  6799.8     FALSE
12  1982-12-31 real_gdp_usd2012  6802.5     FALSE
13  1983-03-31 real_gdp_usd2012  6892.1     FALSE

However, this doesn't work as the output does not return TRUE for any date within a recession period (plus it generates lots of warning messages).

If someone could tell me the correct way to use this approach (a different scalable approach using dplyr would be ok too) I would appreciate it!

Dave

Upvotes: 3

Views: 152

Answers (1)

thelatemail
thelatemail

Reputation: 93803

Looking at the help file you linked, you need to make sure the intervals are in a list for them to be compared properly.

recycled according to standard R rules. If b is a list of intervals, a is checked if it falls within any of the intervals in b.

Therefore, this should work:

my_df %>% mutate(recession = Date %within% as.list(recession_intervals))

Upvotes: 3

Related Questions