Reputation: 55
I have an empty tibble full of NA's with the rows and columns, respectively named for id numbers and all the dates in a period of time. For example, this code:
tbl <- tibble(PERSONAL_ID = c("A", "B", "C", "D"))
dates = as.character((seq(as.Date("2016-01-01"), as.Date("2016-01-05"), by="days")))
tbl[dates] <- NA
tbl <- column_to_rownames(tbl, var = "PERSONAL_ID")
I have a second tibble that contains columns matching up one ID number with one date, as in this example:
enrollments <- tibble(ID = c("D", "B", "C", "D"),
date = c("2016-01-01", "2016-01-03", "2016-01-05", "2016-01-02"))
What I would like to do is add "1" to the row and column of the first tibble (tbl) corresponding to the ID and date listed in the second tibble (enrollments). For the example code above, the desired output would be:
2016-01-01 2016-01-02 2016-01-03 2016-01-04 2016-01-05
A <NA> <NA> <NA> <NA> <NA>
B <NA> <NA> 1 <NA> <NA>
C <NA> <NA> <NA> <NA> 1
D 1 1 <NA> <NA> <NA>
Thank you!
Upvotes: 2
Views: 748
Reputation: 24722
Here's a data.table approach, using PERSONAL_IDS
, dates
, and enrollments
dcast(rbind(
rbindlist(lapply(setdiff(PERSONAL_ID,enrollments$ID), \(x) data.table(ID=x, date=dates)))[,value:=NA],
enrollments[,value:=1]
), ID~date,value.var="value")
Output:
ID 2016-01-01 2016-01-02 2016-01-03 2016-01-04 2016-01-05
1: A NA NA NA NA NA
2: B NA NA 1 NA NA
3: C NA NA NA NA 1
4: D 1 1 NA NA NA
Upvotes: 0
Reputation: 16856
Here is a base R approach. For enrollments, I combine the date
and ID
into one string for each row. Then, for tbl
, I create a table that is filled with the date and rowname. Then, I use match
to match the values from enrollment
with those in tbl
. I use arrayInd
to get the column and row index. Finally, I replace
the NA
values in tbl
with 1
for the extracted column-row indices.
replace(tbl, arrayInd(match(do.call(
paste, subset(enrollments, select = c("date", "ID"))),
t(outer(colnames(tbl), rownames(tbl), FUN = paste)
)), .dim = dim(tbl)), 1)
Here is a slightly different tidyverse
approach by using rows_update
. Here, I pivot enrollments
to a wide format, then bind to tbl
(but with no rows) in order to have the same columns. Then, I update the rows in tbl
with the new format from enrollments
.
library(tidyverse)
rows_update(tbl %>% rownames_to_column("ID"), enrollments %>%
mutate(value = 1) %>%
pivot_wider(names_from = "date", values_from = "value") %>%
bind_rows(tbl[0,])) %>%
column_to_rownames("ID")
Output
2016-01-01 2016-01-02 2016-01-03 2016-01-04 2016-01-05
A NA NA NA NA NA
B NA NA 1 NA NA
C NA NA NA NA 1
D 1 1 NA NA NA
Upvotes: 2
Reputation: 19097
Here's a tidyverse
approach.
tbl
from a wide
format to a long
format so that it matches the format of enrollments
.Count
column in enrollments
, and every row would be Count = 1
.left_join
the transformed tbl
with enrollments
using ID
and date
as the joining field.long
format back to a wide
format and set rownames
.library(tidyverse)
left_join(tbl %>% rownames_to_column(var = "ID") %>%
pivot_longer(-ID, names_to = "date", values_to = "Count") %>%
select(-Count),
enrollments %>% mutate(Count = 1),
by = c("ID", "date")) %>%
pivot_wider(names_from = "date", values_from = "Count") %>%
column_to_rownames(var = "ID")
2016-01-01 2016-01-02 2016-01-03 2016-01-04 2016-01-05
A NA NA NA NA NA
B NA NA 1 NA NA
C NA NA NA NA 1
D 1 1 NA NA NA
tbl
2016-01-01 2016-01-02 2016-01-03 2016-01-04 2016-01-05
A NA NA NA NA NA
B NA NA NA NA NA
C NA NA NA NA NA
D NA NA NA NA NA
enrollments
# A tibble: 4 x 2
ID date
<chr> <chr>
1 D 2016-01-01
2 B 2016-01-03
3 C 2016-01-05
4 D 2016-01-02
Upvotes: 2