Reputation: 773
I have the following data,
id <- c(rep(12, 10), rep(14, 12), rep(16, 2))
m <- c(seq(1:5), seq(8,12), seq(1:12), 10, 12)
y <- c(rep(14, 10), rep(14, 12), rep(15, 2))
v <- rnorm(24)
df <- data.frame(id, m, y, v)
> df
id m y v
1 12 1 14 0.9453216
2 12 2 14 1.0666393
3 12 3 14 -0.2750527
4 12 4 14 1.3264349
5 12 5 14 -1.8046676
6 12 8 14 0.3334960
7 12 9 14 -1.2448408
8 12 10 14 0.5258248
9 12 11 14 -0.1233157
10 12 12 14 1.4717530
11 14 1 14 0.6217376
12 14 2 14 -0.8344823
13 14 3 14 1.1468841
14 14 4 14 -0.3363987
15 14 5 14 -1.3543311
16 14 6 14 -0.2146853
17 14 7 14 -0.6546186
18 14 8 14 -2.4286257
19 14 9 14 -1.3314888
20 14 10 14 0.8215581
21 14 11 14 -0.9999368
22 14 12 14 -1.2935147
23 16 10 15 0.7339261
24 16 12 15 1.1303524
The first column is the id, second column m
is the month, third column y
is the year, and the last column is the value.
In the month column, in the year 14, two observations (June and July) is missing and in the year 15, November is missing.
I would like to have those missing months with a value of zero. That means, for example, for the year 15, the data should look like this,
16 10 15 0.7339261
16 11 15 0
16 12 15 1.1303524
Anyone can suggest a way to do that?
Upvotes: 1
Views: 103
Reputation: 25225
Or in data.table
, generate the months for each id and year, left join this with original dataset on id, y, m and then replace NAs with 0:
library(data.table)
setDT(df)
df[df[, .(m=min(m):max(m)), by=.(id, y)], on=.(id,y,m)][
is.na(v), v := 0]
Upvotes: 4
Reputation: 39858
With dplyr
and tidyr
, you can do:
df %>%
group_by(id) %>%
complete(m = seq(min(m), max(m), 1), fill = list(v = 0)) %>%
fill(y)
id m y v
<dbl> <dbl> <dbl> <dbl>
1 12 1 14 0.539
2 12 2 14 -0.0768
3 12 3 14 1.85
4 12 4 14 -0.855
5 12 5 14 0.0326
6 12 6 14 0
7 12 7 14 0
8 12 8 14 -1.03
9 12 9 14 -0.982
10 12 10 14 0.00410
11 12 11 14 -0.233
12 12 12 14 -0.499
13 14 1 14 1.55
14 14 2 14 0.0875
15 14 3 14 1.32
16 14 4 14 -0.981
17 14 5 14 -0.246
18 14 6 14 -1.40
19 14 7 14 1.44
20 14 8 14 -0.981
21 14 9 14 1.47
22 14 10 14 -0.991
23 14 11 14 -0.0945
24 14 12 14 -2.88
25 16 10 15 -0.247
26 16 11 15 0
27 16 12 15 0.0147
Upvotes: 1