Ethan
Ethan

Reputation: 13

Sum of the number of event from each date interval in R

I have a actual data with massive time intervals, i.e. event_start date (say 2014-10-1), event_end date (2014-12-1), and I also have a reference data set which keeps the whether there is a specific event happened within each day, for example: if there is a event happened in 2014-10-2 and 2014-11-3, the record will be "1", otherwise "0”.

what I want to do is to create a variable in the actual data set which indicates the sum of event number between those event start/end dates, i.e. if there were 100 event happened between 2014-10-1 and 2014-12-1, then the variable of this lane will be 100.

I know in R we can use dplyr to group each starting and ending dates and mutate a new variable. Can anyone tell me how to do it exactly?

Much appreciated.

Upvotes: 0

Views: 1267

Answers (2)

Christoph Wolk
Christoph Wolk

Reputation: 1758

Use lubridate to make the time logic simpler.

 library(lubridate)

 starts <- ymd(df$ event_start_date)
 ends <- ymd(df$ event_end_date)
 days <- ymd(reference$date)

 df$numEvents <- mapply(function(s, e) {
    sum(reference$record[days %within% interval(s,e)]) 
 }, starts, ends)

Upvotes: 3

Carl Boneri
Carl Boneri

Reputation: 2722

I had to recreate data for this... please in the future use ?dput to help give a reproducible example.

Defs: evt_df - A data frame with two columns, the first contains a date, the second holds a numeric value of either 0 or 1, representing an event took place on a given date if value is 1

int_df - A data frame which initially will hold 2 columns, start_date and end_date respectively, representing a span of time

Here's the raw data:

Edit.. added the functions to recreate the data in case someone wants to pick apart more

# evt_df
data.frame(
  dates = seq(Sys.Date() - 1200, Sys.Date(), by = "day"), 
  stringsAsFactors = FALSE
) %>% {
  .[['event']] <-  sample(0:1, nrow(.), TRUE)
  assign("evt_df", ., envir = .GlobalEnv)
}


## int_df
data.frame(
  start_date = evt_df[sample(1:nrow(evt_df), 100), 'dates'],
  end_date = evt_df[sample(1:nrow(evt_df), 100),'dates'],
  stringsAsFactors = FALSE
) %>% {
  adply(., 1, function(i){
    if(i[[1]] > i[[2]]){
      c(i[[2]], i[[1]])
    }else if(i[[1]] == i[[2]]){
      c(i[[1]] - 5, i[[2]])
    }else {
      c(i[[1]], i[[2]])
    }
  }, .id = NULL, .expand = FALSE) %>% 
    setNames(c('start_date', 'end_date')) %>% 
    assign("int_df", ., envir = .GlobalEnv)
}
> dput(int_df)
structure(list(start_date = c("2015-03-04", "2015-09-28", "2015-05-23", 
"2015-09-29", "2015-04-27", "2014-06-08", "2015-12-01", "2015-12-21", 
"2014-07-22", "2014-05-30", "2016-04-26", "2016-07-09", "2015-02-14", 
"2017-02-23", "2014-08-19", "2015-12-22", "2016-03-22", "2016-11-21", 
"2016-02-07", "2015-11-15", "2015-02-25", "2015-06-12", "2014-10-11", 
"2014-09-12", "2017-06-30", "2015-04-14", "2016-04-20", "2016-06-21", 
"2015-06-27", "2016-04-15", "2015-11-21", "2015-07-08", "2015-02-18", 
"2014-08-16", "2015-09-27", "2014-09-23", "2014-06-02", "2014-11-17", 
"2017-03-19", "2014-08-14", "2015-05-10", "2015-11-28", "2015-09-19", 
"2016-04-10", "2014-12-31", "2016-08-05", "2014-08-15", "2015-03-31", 
"2015-09-26", "2016-05-21", "2015-03-03", "2015-01-24", "2014-10-19", 
"2014-10-15", "2014-09-08", "2014-08-11", "2015-09-03", "2014-07-17", 
"2015-07-03", "2015-07-30", "2015-01-12", "2016-05-07", "2015-11-28", 
"2015-08-29", "2016-08-30", "2014-11-11", "2016-06-18", "2015-05-28", 
"2017-03-21", "2015-01-18", "2014-08-29", "2016-02-14", "2016-07-02", 
"2016-04-08", "2015-08-11", "2014-06-25", "2016-03-23", "2015-07-28", 
"2016-08-02", "2015-10-12", "2015-11-07", "2015-08-17", "2016-01-24", 
"2016-12-16", "2015-06-06", "2014-07-13", "2014-09-26", "2015-06-14", 
"2016-11-03", "2016-05-04", "2015-03-23", "2014-09-08", "2016-01-10", 
"2014-12-29", "2014-08-09", "2015-05-17", "2014-07-18", "2016-02-25", 
"2016-11-22"), end_date = c("2015-03-19", "2017-03-06", "2015-06-07", 
"2017-01-28", "2015-05-12", "2014-06-22", "2016-11-09", "2017-01-20", 
"2016-06-02", "2014-06-14", "2017-03-15", "2016-07-24", "2015-03-01", 
"2017-03-10", "2014-09-03", "2016-04-29", "2016-12-23", "2016-12-06", 
"2017-08-11", "2015-11-30", "2015-04-26", "2015-10-27", "2014-10-26", 
"2017-01-03", "2017-07-14", "2015-04-29", "2016-10-10", "2017-05-15", 
"2016-09-08", "2017-06-11", "2015-12-06", "2015-07-23", "2015-03-05", 
"2014-08-31", "2016-09-27", "2014-10-08", "2014-06-17", "2014-12-02", 
"2017-04-03", "2014-10-31", "2016-06-15", "2016-06-25", "2015-10-04", 
"2016-04-25", "2016-10-11", "2016-08-20", "2014-08-30", "2015-04-15", 
"2017-05-07", "2016-06-05", "2015-03-18", "2015-02-08", "2017-06-26", 
"2014-10-30", "2015-01-23", "2014-08-26", "2015-09-18", "2017-06-10", 
"2015-07-18", "2015-08-14", "2015-01-27", "2016-05-22", "2015-12-13", 
"2015-09-13", "2017-08-03", "2016-11-15", "2017-08-19", "2015-12-08", 
"2017-04-05", "2016-05-12", "2014-09-13", "2016-02-29", "2016-07-17", 
"2016-04-23", "2015-08-26", "2016-09-19", "2016-04-07", "2015-08-12", 
"2017-04-29", "2016-07-18", "2016-05-27", "2015-09-01", "2016-02-08", 
"2017-07-02", "2015-06-21", "2016-06-08", "2015-05-01", "2016-01-18", 
"2016-11-18", "2016-05-19", "2016-05-02", "2014-09-23", "2016-01-25", 
"2016-08-01", "2015-06-13", "2017-02-24", "2017-03-14", "2016-03-11", 
"2017-01-08")), class = "data.frame", row.names = c(NA, -99L), .Names = c("start_date", 
"end_date"))

> dput(evt_df)
structure(list(date = structure(c(16208, 16209, 16210, 16211, 
16212, 16213, 16214, 16215, 16216, 16217, 16218, 16219, 16220, 
16221, 16222, 16223, 16224, 16225, 16226, 16227, 16228, 16229, 
16230, 16231, 16232, 16233, 16234, 16235, 16236, 16237, 16238, 
16239, 16240, 16241, 16242, 16243, 16244, 16245, 16246, 16247, 
16248, 16249, 16250, 16251, 16252, 16253, 16254, 16255, 16256, 
16257, 16258, 16259, 16260, 16261, 16262, 16263, 16264, 16265, 
16266, 16267, 16268, 16269, 16270, 16271, 16272, 16273, 16274, 
16275, 16276, 16277, 16278, 16279, 16280, 16281, 16282, 16283, 
16284, 16285, 16286, 16287, 16288, 16289, 16290, 16291, 16292, 
16293, 16294, 16295, 16296, 16297, 16298, 16299, 16300, 16301, 
16302, 16303, 16304, 16305, 16306, 16307, 16308, 16309, 16310, 
16311, 16312, 16313, 16314, 16315, 16316, 16317, 16318, 16319, 
16320, 16321, 16322, 16323, 16324, 16325, 16326, 16327, 16328, 
16329, 16330, 16331, 16332, 16333, 16334, 16335, 16336, 16337, 
16338, 16339, 16340, 16341, 16342, 16343, 16344, 16345, 16346, 
16347, 16348, 16349, 16350, 16351, 16352, 16353, 16354, 16355, 
16356, 16357, 16358, 16359, 16360, 16361, 16362, 16363, 16364, 
16365, 16366, 16367, 16368, 16369, 16370, 16371, 16372, 16373, 
16374, 16375, 16376, 16377, 16378, 16379, 16380, 16381, 16382, 
16383, 16384, 16385, 16386, 16387, 16388, 16389, 16390, 16391, 
16392, 16393, 16394, 16395, 16396, 16397, 16398, 16399, 16400, 
16401, 16402, 16403, 16404, 16405, 16406, 16407, 16408, 16409, 
16410, 16411, 16412, 16413, 16414, 16415, 16416, 16417, 16418, 
16419, 16420, 16421, 16422, 16423, 16424, 16425, 16426, 16427, 
16428, 16429, 16430, 16431, 16432, 16433, 16434, 16435, 16436, 
16437, 16438, 16439, 16440, 16441, 16442, 16443, 16444, 16445, 
16446, 16447, 16448, 16449, 16450, 16451, 16452, 16453, 16454, 
16455, 16456, 16457, 16458, 16459, 16460, 16461, 16462, 16463, 
16464, 16465, 16466, 16467, 16468, 16469, 16470, 16471, 16472, 
16473, 16474, 16475, 16476, 16477, 16478, 16479, 16480, 16481, 
16482, 16483, 16484, 16485, 16486, 16487, 16488, 16489, 16490, 
16491, 16492, 16493, 16494, 16495, 16496, 16497, 16498, 16499, 
16500, 16501, 16502, 16503, 16504, 16505, 16506, 16507, 16508, 
16509, 16510, 16511, 16512, 16513, 16514, 16515, 16516, 16517, 
16518, 16519, 16520, 16521, 16522, 16523, 16524, 16525, 16526, 
16527, 16528, 16529, 16530, 16531, 16532, 16533, 16534, 16535, 
16536, 16537, 16538, 16539, 16540, 16541, 16542, 16543, 16544, 
16545, 16546, 16547, 16548, 16549, 16550, 16551, 16552, 16553, 
16554, 16555, 16556, 16557, 16558, 16559, 16560, 16561, 16562, 
16563, 16564, 16565, 16566, 16567, 16568, 16569, 16570, 16571, 
16572, 16573, 16574, 16575, 16576, 16577, 16578, 16579, 16580, 
16581, 16582, 16583, 16584, 16585, 16586, 16587, 16588, 16589, 
16590, 16591, 16592, 16593, 16594, 16595, 16596, 16597, 16598, 
16599, 16600, 16601, 16602, 16603, 16604, 16605, 16606, 16607, 
16608, 16609, 16610, 16611, 16612, 16613, 16614, 16615, 16616, 
16617, 16618, 16619, 16620, 16621, 16622, 16623, 16624, 16625, 
16626, 16627, 16628, 16629, 16630, 16631, 16632, 16633, 16634, 
16635, 16636, 16637, 16638, 16639, 16640, 16641, 16642, 16643, 
16644, 16645, 16646, 16647, 16648, 16649, 16650, 16651, 16652, 
16653, 16654, 16655, 16656, 16657, 16658, 16659, 16660, 16661, 
16662, 16663, 16664, 16665, 16666, 16667, 16668, 16669, 16670, 
16671, 16672, 16673, 16674, 16675, 16676, 16677, 16678, 16679, 
16680, 16681, 16682, 16683, 16684, 16685, 16686, 16687, 16688, 
16689, 16690, 16691, 16692, 16693, 16694, 16695, 16696, 16697, 
16698, 16699, 16700, 16701, 16702, 16703, 16704, 16705, 16706, 
16707, 16708, 16709, 16710, 16711, 16712, 16713, 16714, 16715, 
16716, 16717, 16718, 16719, 16720, 16721, 16722, 16723, 16724, 
16725, 16726, 16727, 16728, 16729, 16730, 16731, 16732, 16733, 
16734, 16735, 16736, 16737, 16738, 16739, 16740, 16741, 16742, 
16743, 16744, 16745, 16746, 16747, 16748, 16749, 16750, 16751, 
16752, 16753, 16754, 16755, 16756, 16757, 16758, 16759, 16760, 
16761, 16762, 16763, 16764, 16765, 16766, 16767, 16768, 16769, 
16770, 16771, 16772, 16773, 16774, 16775, 16776, 16777, 16778, 
16779, 16780, 16781, 16782, 16783, 16784, 16785, 16786, 16787, 
16788, 16789, 16790, 16791, 16792, 16793, 16794, 16795, 16796, 
16797, 16798, 16799, 16800, 16801, 16802, 16803, 16804, 16805, 
16806, 16807, 16808, 16809, 16810, 16811, 16812, 16813, 16814, 
16815, 16816, 16817, 16818, 16819, 16820, 16821, 16822, 16823, 
16824, 16825, 16826, 16827, 16828, 16829, 16830, 16831, 16832, 
16833, 16834, 16835, 16836, 16837, 16838, 16839, 16840, 16841, 
16842, 16843, 16844, 16845, 16846, 16847, 16848, 16849, 16850, 
16851, 16852, 16853, 16854, 16855, 16856, 16857, 16858, 16859, 
16860, 16861, 16862, 16863, 16864, 16865, 16866, 16867, 16868, 
16869, 16870, 16871, 16872, 16873, 16874, 16875, 16876, 16877, 
16878, 16879, 16880, 16881, 16882, 16883, 16884, 16885, 16886, 
16887, 16888, 16889, 16890, 16891, 16892, 16893, 16894, 16895, 
16896, 16897, 16898, 16899, 16900, 16901, 16902, 16903, 16904, 
16905, 16906, 16907, 16908, 16909, 16910, 16911, 16912, 16913, 
16914, 16915, 16916, 16917, 16918, 16919, 16920, 16921, 16922, 
16923, 16924, 16925, 16926, 16927, 16928, 16929, 16930, 16931, 
16932, 16933, 16934, 16935, 16936, 16937, 16938, 16939, 16940, 
16941, 16942, 16943, 16944, 16945, 16946, 16947, 16948, 16949, 
16950, 16951, 16952, 16953, 16954, 16955, 16956, 16957, 16958, 
16959, 16960, 16961, 16962, 16963, 16964, 16965, 16966, 16967, 
16968, 16969, 16970, 16971, 16972, 16973, 16974, 16975, 16976, 
16977, 16978, 16979, 16980, 16981, 16982, 16983, 16984, 16985, 
16986, 16987, 16988, 16989, 16990, 16991, 16992, 16993, 16994, 
16995, 16996, 16997, 16998, 16999, 17000, 17001, 17002, 17003, 
17004, 17005, 17006, 17007, 17008, 17009, 17010, 17011, 17012, 
17013, 17014, 17015, 17016, 17017, 17018, 17019, 17020, 17021, 
17022, 17023, 17024, 17025, 17026, 17027, 17028, 17029, 17030, 
17031, 17032, 17033, 17034, 17035, 17036, 17037, 17038, 17039, 
17040, 17041, 17042, 17043, 17044, 17045, 17046, 17047, 17048, 
17049, 17050, 17051, 17052, 17053, 17054, 17055, 17056, 17057, 
17058, 17059, 17060, 17061, 17062, 17063, 17064, 17065, 17066, 
17067, 17068, 17069, 17070, 17071, 17072, 17073, 17074, 17075, 
17076, 17077, 17078, 17079, 17080, 17081, 17082, 17083, 17084, 
17085, 17086, 17087, 17088, 17089, 17090, 17091, 17092, 17093, 
17094, 17095, 17096, 17097, 17098, 17099, 17100, 17101, 17102, 
17103, 17104, 17105, 17106, 17107, 17108, 17109, 17110, 17111, 
17112, 17113, 17114, 17115, 17116, 17117, 17118, 17119, 17120, 
17121, 17122, 17123, 17124, 17125, 17126, 17127, 17128, 17129, 
17130, 17131, 17132, 17133, 17134, 17135, 17136, 17137, 17138, 
17139, 17140, 17141, 17142, 17143, 17144, 17145, 17146, 17147, 
17148, 17149, 17150, 17151, 17152, 17153, 17154, 17155, 17156, 
17157, 17158, 17159, 17160, 17161, 17162, 17163, 17164, 17165, 
17166, 17167, 17168, 17169, 17170, 17171, 17172, 17173, 17174, 
17175, 17176, 17177, 17178, 17179, 17180, 17181, 17182, 17183, 
17184, 17185, 17186, 17187, 17188, 17189, 17190, 17191, 17192, 
17193, 17194, 17195, 17196, 17197, 17198, 17199, 17200, 17201, 
17202, 17203, 17204, 17205, 17206, 17207, 17208, 17209, 17210, 
17211, 17212, 17213, 17214, 17215, 17216, 17217, 17218, 17219, 
17220, 17221, 17222, 17223, 17224, 17225, 17226, 17227, 17228, 
17229, 17230, 17231, 17232, 17233, 17234, 17235, 17236, 17237, 
17238, 17239, 17240, 17241, 17242, 17243, 17244, 17245, 17246, 
17247, 17248, 17249, 17250, 17251, 17252, 17253, 17254, 17255, 
17256, 17257, 17258, 17259, 17260, 17261, 17262, 17263, 17264, 
17265, 17266, 17267, 17268, 17269, 17270, 17271, 17272, 17273, 
17274, 17275, 17276, 17277, 17278, 17279, 17280, 17281, 17282, 
17283, 17284, 17285, 17286, 17287, 17288, 17289, 17290, 17291, 
17292, 17293, 17294, 17295, 17296, 17297, 17298, 17299, 17300, 
17301, 17302, 17303, 17304, 17305, 17306, 17307, 17308, 17309, 
17310, 17311, 17312, 17313, 17314, 17315, 17316, 17317, 17318, 
17319, 17320, 17321, 17322, 17323, 17324, 17325, 17326, 17327, 
17328, 17329, 17330, 17331, 17332, 17333, 17334, 17335, 17336, 
17337, 17338, 17339, 17340, 17341, 17342, 17343, 17344, 17345, 
17346, 17347, 17348, 17349, 17350, 17351, 17352, 17353, 17354, 
17355, 17356, 17357, 17358, 17359, 17360, 17361, 17362, 17363, 
17364, 17365, 17366, 17367, 17368, 17369, 17370, 17371, 17372, 
17373, 17374, 17375, 17376, 17377, 17378, 17379, 17380, 17381, 
17382, 17383, 17384, 17385, 17386, 17387, 17388, 17389, 17390, 
17391, 17392, 17393, 17394, 17395, 17396, 17397, 17398, 17399, 
17400, 17401, 17402, 17403, 17404, 17405, 17406, 17407, 17408
), class = "Date"), event = c(1L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 
1L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 
0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 
0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 
1L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 
1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 
0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 
0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 
0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 
1L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 
1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 
1L, 1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 
1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 
0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 
0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 
0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 
0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
0L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 
0L, 1L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 
0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 
0L, 0L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 
1L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 
0L, 1L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 
0L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 
0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 
1L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 
0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 
1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 
1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 
1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 
0L, 1L, 1L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 
1L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 
1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 
0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 
0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 
0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 
1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 
1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 
1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 
0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 
1L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 
0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 
1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 
1L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 
1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 
1L, 0L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 
0L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 
1L, 1L, 0L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 
0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 
1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 
0L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 
1L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 
0L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 
1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 
1L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 0L, 
1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 
1L, 0L, 0L, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 0L, 
0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 1L, 
1L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 
0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 
1L, 1L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 
1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 
0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 
0L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 
1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 1L, 0L, 1L, 0L, 1L, 
1L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 
0L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 0L, 1L, 0L, 0L, 0L, 
1L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L)), .Names = c("date", "event"
), row.names = c(NA, -1201L), class = "data.frame")

The evaluation

To evaluate essentially what I understood from your question was we want to use one static table as a reference, or key table, to then find all the iterations that fall between our two dates, contained in each row of our second table, which we will then total the event values, and create a new, third column in our second table, so that each row will represent: start_date - our beginning date end_date - our ending date closing the time-window total_events - The sum of all events which took place on, or between, our start and end dates

Note: I'm going to use the plyr package... which will generate flack but it's my preferred approach to this solution.

The idea is this...

  1. we're going to go row-by-row through our second table, which has a start and end date column.
  2. For each iteration, we're going to find the start_date value (int_df[i,"start_date"]) and the end_date values (int_df[i, "end_date"]) in our lookup table by their POSITION. Meaning, for row 12 of our int_df, which row in our evt_df contains that date, and the same for the end date.
  3. With that numerical value, which represents the row position, we will create a sequence of numbers that are the range between the matched start and end dates.Think in terms of "Start date is row 12, end date is row 15, so get all rows from 12 through 15"
    1. Extract the applicable rows, and get the column which contains the event values and simply sum them, so that for each range we have a total number of events

Please see ?plyr::adply for better understanding of what you'll see below. Short order is: For each slice of an array, apply function then combine results into a data frame.

The reason this gives us exactly what we want is that we are treating each row of our key table as an array which we will then

library(plyr)
library(dplyr)
> new_df <- adply(int_df, 1, function(i){
        s <- which(evt_df$date == i[['start_date']])
        e <- which(evt_df$date == i[['end_date']])
        sum(evt_df[seq(s, e), 2])
  }, .id = NULL, .expand = T) %>% select(1, 2, total_events = 3)

> head(new_df)
  start_date   end_date total_events
1 2015-03-04 2015-03-19           11
2 2015-09-28 2017-03-06          270
3 2015-05-23 2015-06-07           13
4 2015-09-29 2017-01-28          250
5 2015-04-27 2015-05-12            6
6 2014-06-08 2014-06-22            4

Hopefully that helps you and gives an understanding of the process approach

Upvotes: 1

Related Questions