Reputation: 3577
I have a data frame like so:
library(tidyverse)
# make some data
df <- tibble(ID = c(1, 1, 2, 2),
             Year = c(2000, 2003, 2000, 2003),
             Value = c(1, 1, 1, 1))
     ID  Year Value
  <dbl> <dbl> <dbl>
1     1  2000     1
2     1  2003     1
3     2  2000     1
4     2  2003     1
This is missing the years 2001, 2002, 2004 and 2005. I would like to group by the ID column, add the missing years for each ID, and fill the Value column with NaN in those new rows. My expected output is:
wanted <- tibble(ID = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
                 Year = c(2000, 2001, 2002, 2003, 2004, 2005, 2000, 2001, 2002, 2003, 2004, 2005),
                 Value = c(1, NaN, NaN, 1, NaN, NaN, 1, NaN, NaN, 1, NaN, NaN))
      ID  Year Value
   <dbl> <dbl> <dbl>
 1     1  2000     1
 2     1  2001   NaN
 3     1  2002   NaN
 4     1  2003     1
 5     1  2004   NaN
 6     1  2005   NaN
 7     2  2000     1
 8     2  2001   NaN
 9     2  2002   NaN
10     2  2003     1
11     2  2004   NaN
12     2  2005   NaN
I have looked into the complete and fill functions in the tidyverse, but I can't quite get them to work.
Ideally I would like to supply the sequence I want in the Year column and then fill the Value column with NaN for every year that was missing. I have only supplied a simplified example here; in this case the wanted sequence would be seq(2000, 2005, 1).
Upvotes: 1
Views: 393
Reputation: 39154
We can use the complete function to do the job.
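library(tidyverse)
df2 <- df %>%
  group_by(ID) %>%
  # full_seq() spans only the observed range of Year within each group
  # (2000 to 2003 here), so years beyond the last observation are not added
  complete(Year = full_seq(Year, period = 1), fill = list(Value = NaN)) %>%
  ungroup()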
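Since the question asks for a fixed sequence that runs past the last observed year (2004 and 2005 are wanted as well), one possible variant is to hand complete the sequence directly rather than deriving it from the data. This is a minimal sketch, assuming every ID should receive the same set of years; the names wanted_years and completed are illustrative and not from the original post.

```r
library(tidyverse)

# Example data from the question
df <- tibble(ID = c(1, 1, 2, 2),
             Year = c(2000, 2003, 2000, 2003),
             Value = c(1, 1, 1, 1))

# Hypothetical name: the full set of years every ID should have
wanted_years <- seq(2000, 2005, 1)

# Expand to every ID x Year combination; rows created by the
# expansion get NaN in Value instead of the default NA
completed <- df %>%
  complete(ID, Year = wanted_years, fill = list(Value = NaN))

print(completed)
```

Because the same years apply to every ID, no group_by is needed here: listing ID inside complete crosses the existing IDs with the supplied years. If the range should instead differ per ID, the grouped full_seq approach is probably the better fit.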
Upvotes: 3