T.McMillen
T.McMillen

Reputation: 137

How do I get the total number of tests performed based on ID and date range with R?

My data looks like this. Some people may have up to 48 observations spread over 4 years. I'm looking to get specific totals based on date ranges.

Name <- c("Doe, John","Doe, John","Doe, John", "Doe, Jane", "Doe, Jane","Doe, Jane", "Doe, Jane")
Accession <- c(123, 234, 345, 456, 567, 678, 789)
MRN <-c(55555, 55555, 55555, 66666, 66666, 66666, 66666)
Collected <-c("2022-01-05", "2022-01-06", "2022-01-07", "2022-01-08", "2022-01-09", "2022-01-20", "2022-01-15")
Result <-c("Detected", "Negative", "Detected", "Negative", "Negative", "Negative", "Detected")

CV <- data.frame(Name, Accession, MRN, Collected, Result)

      Name      Accession           MRN      Collected    Result
1   Doe, John     123              55555      2022-01-05   Detected
2   Doe, John     234              55555      2022-01-06   Negative
3   Doe, John     345              55555      2022-01-07   Detected
4   Doe, Jane     456              66666      2022-01-08   Negative
5   Doe, Jane     567              66666      2022-01-09   Negative
6   Doe, Jane     678              66666      2022-01-20   Negative

I would like to tally observations based on MRN that are +/- 1 day and +/- 2 days from any "Collected" date in the dataframe. I would like my output to look like this:

Name                 MRN        +/-1day_Number_of_Tests            +/-2days_Number_of_Tests
Doe, John           55555                3                                      2
Doe, Jane           66666                3                                      2

Upvotes: 0

Views: 47

Answers (1)

Shahab Einabadi
Shahab Einabadi

Reputation: 342

The question is somewhat unclear, but based on my interpretation, this solution should work. Additionally, I changed the type of the Collected vector to date.

Name <- c("Doe, John","Doe, John","Doe, John", "Doe, Jane", "Doe, Jane","Doe, Jane", "Doe, Jane")
Accession <- c(123, 234, 345, 456, 567, 678, 789)
MRN <-c(55555, 55555, 55555, 66666, 66666, 66666, 66666)
Collected <- as.Date(c("2022-01-05", "2022-01-06", "2022-01-07", "2022-01-08", "2022-01-09", "2022-01-20", "2022-01-15"), tz = 'America/New_York')
Result <-c("Detected", "Negative", "Detected", "Negative", "Negative", "Negative", "Detected")
CV <- data.frame(Name, Accession, MRN, Collected, Result)

library(tidyverse)

CV %>% 
  group_by(Name, MRN) %>% 
  summarise(
    day_1 = sum(between(Collected, Collected - 1, Collected + 1), na.rm = TRUE),
    day_2 = sum(between(Collected, Collected - 2, Collected + 2), na.rm = TRUE)
  ) %>% 
  ungroup()

Upvotes: 0

Related Questions