Reputation: 158
I need all days a patient is in the hospital , days between AufnDat and EntlDat. I use seq with mutate as shown in the reprex I get the error: Error in seq.Date(from = dmy(AufnDat), to = dmy(EntlDat), by = "day"): 'from' muss Länge 1 haben
``` r
library(tidyverse)
library(lubridate)
#>
#> Attache Paket: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
df <- tibble::tribble(
~PLZ, ~AufnDat, ~EntlDat, ~Stat1,
67067, "22.01.2020", "23.01.2020", "1A",
74936, "21.01.2020", "23.01.2020", "1A",
97999, "21.01.2020", "27.01.2020", "0A",
76669, "15.01.2020", "17.01.2020", "1",
69126, "07.01.2020", "13.01.2020", "0A",
68753, "13.01.2020", "17.01.2020", "1A",
74928, "23.01.2020", "28.01.2020", "2A",
64720, "10.01.2020", "13.01.2020", "1",
48599, "16.01.2020", "22.01.2020", "0A",
68723, "07.01.2020", "13.01.2020", "0A"
)
df <- df %>% mutate(seqdat = seq(from=dmy(AufnDat), to=dmy(EntlDat), by="day"))
#> Error in seq.Date(from = dmy(AufnDat), to = dmy(EntlDat), by = "day"): 'from' muss Länge 1 haben
```
if I try the same for one row without mutate it works
``` r
## this works:
seq(from=dmy(df[1,2]), to=dmy(df[1,3]), by="day")
#> [1] "2020-01-22" "2020-01-23"
```
Created on 2020-02-21 by the reprex package (v0.3.0)
Where is the difference and my fault ?
Upvotes: 1
Views: 102
Reputation: 6485
Alternatively to @akruns answer we can store seqdat
as list of lists:
library(dplyr)
library(lubridate)
df_new <- df %>%
mutate(ID = row_number()) %>%
group_by(ID) %>%
mutate(seqdat = list(seq(from=dmy(AufnDat), to=dmy(EntlDat), by="day"))) %>%
ungroup()
df_new is now:
df_new
# A tibble: 10 x 6
# Groups: ID [10]
PLZ AufnDat EntlDat Stat1 ID seqdat
<dbl> <chr> <chr> <chr> <int> <list>
1 67067 22.01.2020 23.01.2020 1A 1 <date [2]>
2 74936 21.01.2020 23.01.2020 1A 2 <date [3]>
3 97999 21.01.2020 27.01.2020 0A 3 <date [7]>
4 76669 15.01.2020 17.01.2020 1 4 <date [3]>
5 69126 07.01.2020 13.01.2020 0A 5 <date [7]>
6 68753 13.01.2020 17.01.2020 1A 6 <date [5]>
7 74928 23.01.2020 28.01.2020 2A 7 <date [6]>
8 64720 10.01.2020 13.01.2020 1 8 <date [4]>
9 48599 16.01.2020 22.01.2020 0A 9 <date [7]>
10 68723 07.01.2020 13.01.2020 0A 10 <date [7]>
and
df_new$seqdat[1]
Returns:
[[1]]
[1] "2020-01-22" "2020-01-23"
Upvotes: 1
Reputation: 886948
We can use map2
to loop over each of the 'Aufndat', 'EntlDat' to get the sequence of 'Date' as seq
is not vectorized for the from
, to
. It would be a list
column, which we can unnest
to expand the dataset
library(dplyr)
library(purrr)
library(tidyr)
df %>%
mutate(seqdat = map2(dmy(AufnDat), dmy(EntlDat), seq, by = 'day')) %>%
unnest(c(seqdat))
Upvotes: 1