R - calculate daily/weekly rate with changing denominator

Question

I'm trying to calculate a daily/weekly prevalence rate for a condition, but the sample size in the denominator varies over time. I have a dataset that includes the date on which each subject entered and left the sample (e.g. birth/death dates), the date on which each subject contracted the condition if applicable, and some demographic characteristics.

1. How can I calculate the total number of people who were in the sample by day (or by week)?
2. How can I calculate the daily (or weekly) prevalence rate of the condition given this changing denominator?
3. For the purposes of statistical inference (e.g. assessing whether the condition's prevalence varies before/after a certain date, controlling for demographic characteristics), how can I include the demographic information in the output generated from #1 and #2?

Example data:

# Dates are month/day/year; as.Date() needs the format spelled out,
# otherwise "12/01/2020" is misread as year 12.
ex <- data.frame(
  id=1:10,
  birth=as.Date(c("12/01/2020", "12/01/2020", "12/01/2020", "12/02/2020", "12/02/2020",
                  "12/02/2020", "12/03/2020", "12/04/2020", "12/04/2020", "12/04/2020"),
                format="%m/%d/%Y"),
  sick=as.Date(c("12/03/2020", "12/04/2020", "12/02/2020", "12/03/2020", "12/06/2020",
                 NA, "12/06/2020", "12/07/2020", "12/09/2020", NA),
               format="%m/%d/%Y"),
  death=as.Date(c("12/05/2020", "12/05/2020", "12/04/2020", "12/08/2020", "12/07/2020",
                  NA, "12/07/2020", "12/09/2020", "12/10/2020", NA),
                format="%m/%d/%Y"),
  gender=c("male", "male", "female", "female", "female", "male", "female", "male", "male", "male")
)

Desired output:

sick <- data.frame(
  date=c("12/01/2020", "12/02/2020", "12/03/2020", "12/04/2020", "12/05/2020",
         "12/06/2020", "12/07/2020", "12/08/2020", "12/09/2020", "12/10/2020"),
  count_alive=c(3, 6, 7, 9, 7, 7, 5, 4, 3, 2),
  count_sick=c(0, 1, 3, 4, 1, 4, 2, 1, 1, 0)
)

sick$pct_sick <- sick$count_sick/sick$count_alive*100
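
One possible way to get there in base R is to expand the data to one row per subject per day and then aggregate. This is only a sketch under stated assumptions, not a definitive answer: dates are read as month/day/year, a subject is counted in the sample from `birth` up to but not including `death`, a missing `death` means the subject never leaves, and a subject stays sick from the `sick` date until leaving. The names `fmt`, `days`, `pd`, `alive`, `is_sick`, `daily`, `by_gender` and `weekly` are made up for illustration.

```r
# Example data from the question; dates are assumed to be month/day/year.
fmt <- "%m/%d/%Y"
ex <- data.frame(
  id = 1:10,
  birth = as.Date(c("12/01/2020", "12/01/2020", "12/01/2020", "12/02/2020", "12/02/2020",
                    "12/02/2020", "12/03/2020", "12/04/2020", "12/04/2020", "12/04/2020"),
                  format = fmt),
  sick = as.Date(c("12/03/2020", "12/04/2020", "12/02/2020", "12/03/2020", "12/06/2020",
                   NA, "12/06/2020", "12/07/2020", "12/09/2020", NA),
                 format = fmt),
  death = as.Date(c("12/05/2020", "12/05/2020", "12/04/2020", "12/08/2020", "12/07/2020",
                    NA, "12/07/2020", "12/09/2020", "12/10/2020", NA),
                  format = fmt),
  gender = c("male", "male", "female", "female", "female",
             "male", "female", "male", "male", "male")
)

# Every calendar day spanned by the data.
days <- seq(min(ex$birth), max(ex$death, na.rm = TRUE), by = "day")

# Cross join: one row per (day, subject) pair, demographics carried along.
pd <- merge(data.frame(date = days), ex, by = NULL)

# Keep only the days on which the subject is in the sample.
# Assumption: the death date itself is excluded; NA death = still present.
pd <- pd[pd$birth <= pd$date & (is.na(pd$death) | pd$death > pd$date), ]
pd$alive <- 1L

# Assumption: a subject is sick from the `sick` date until leaving the sample.
pd$is_sick <- as.integer(!is.na(pd$sick) & pd$sick <= pd$date)

# Daily denominator, numerator and prevalence.
daily <- aggregate(cbind(alive, is_sick) ~ date, data = pd, FUN = sum)
names(daily) <- c("date", "count_alive", "count_sick")
daily$pct_sick <- daily$count_sick / daily$count_alive * 100

# Same counts split by a demographic variable
# (date/gender combinations with nobody in the sample are simply absent).
by_gender <- aggregate(cbind(alive, is_sick) ~ date + gender, data = pd, FUN = sum)

# Weekly version: sick person-days over person-days in the sample.
# cut() groups dates into weeks starting on Monday by default.
pd$week <- cut(pd$date, "week")
weekly <- aggregate(cbind(alive, is_sick) ~ week, data = pd, FUN = sum)
weekly$pct_sick <- weekly$is_sick / weekly$alive * 100
```

Under these conventions `daily$count_alive` reproduces the desired `count_alive` column, while `daily$count_sick` comes out as 3 rather than 4 on 12/04 and 12/06, so the hand-written desired output may be using a different rule for those days. Because `pd` keeps one row per subject per day together with `gender`, it could also serve as the input to a regression with `is_sick` as the outcome, if that is the kind of inference intended.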
