small_lebowski
small_lebowski

Reputation: 773

Filling the gap in the time series

I have the following data,

id <- c(rep(12, 10), rep(14, 12), rep(16, 2))
m <- c(seq(1:5), seq(8,12), seq(1:12), 10, 12)
y <- c(rep(14, 10), rep(14, 12), rep(15, 2))
v <- rnorm(24)
df <- data.frame(id, m, y, v)
> df
   id  m  y          v
1  12  1 14  0.9453216
2  12  2 14  1.0666393
3  12  3 14 -0.2750527
4  12  4 14  1.3264349
5  12  5 14 -1.8046676
6  12  8 14  0.3334960
7  12  9 14 -1.2448408
8  12 10 14  0.5258248
9  12 11 14 -0.1233157
10 12 12 14  1.4717530
11 14  1 14  0.6217376
12 14  2 14 -0.8344823
13 14  3 14  1.1468841
14 14  4 14 -0.3363987
15 14  5 14 -1.3543311
16 14  6 14 -0.2146853
17 14  7 14 -0.6546186
18 14  8 14 -2.4286257
19 14  9 14 -1.3314888
20 14 10 14  0.8215581
21 14 11 14 -0.9999368
22 14 12 14 -1.2935147
23 16 10 15  0.7339261
24 16 12 15  1.1303524

The first column is the id, second column m is the month, third column y is the year, and the last column is the value.

In the month column, in the year 14, two observations (June and July) is missing and in the year 15, November is missing.

I would like to have those missing months with a value of zero. That means, for example, for the year 15, the data should look like this,

16 10 15    0.7339261
16 11 15    0
16 12 15    1.1303524

Anyone can suggest a way to do that?

Upvotes: 1

Views: 103

Answers (2)

chinsoon12
chinsoon12

Reputation: 25225

Or in data.table, generate the months for each id and year, left join this with original dataset on id, y, m and then replace NAs with 0:

library(data.table)
setDT(df)

df[df[, .(m=min(m):max(m)), by=.(id, y)], on=.(id,y,m)][
    is.na(v), v := 0]

Upvotes: 4

tmfmnk
tmfmnk

Reputation: 39858

With dplyr and tidyr, you can do:

df %>%
 group_by(id) %>%
 complete(m = seq(min(m), max(m), 1), fill = list(v = 0)) %>%
 fill(y)

      id     m     y        v
   <dbl> <dbl> <dbl>    <dbl>
 1    12     1    14  0.539  
 2    12     2    14 -0.0768 
 3    12     3    14  1.85   
 4    12     4    14 -0.855  
 5    12     5    14  0.0326 
 6    12     6    14  0      
 7    12     7    14  0      
 8    12     8    14 -1.03   
 9    12     9    14 -0.982  
10    12    10    14  0.00410
11    12    11    14 -0.233  
12    12    12    14 -0.499  
13    14     1    14  1.55   
14    14     2    14  0.0875 
15    14     3    14  1.32   
16    14     4    14 -0.981  
17    14     5    14 -0.246  
18    14     6    14 -1.40   
19    14     7    14  1.44   
20    14     8    14 -0.981  
21    14     9    14  1.47   
22    14    10    14 -0.991  
23    14    11    14 -0.0945 
24    14    12    14 -2.88   
25    16    10    15 -0.247  
26    16    11    15  0      
27    16    12    15  0.0147 

Upvotes: 1

Related Questions