NonSleeper
NonSleeper

Reputation: 851

How can I fill values from duplicates in Stata?

The original data looks like this (Date is in dd/mm/yyyy format):

ID    Date          Var  
01    21/01/2016    1  
01    22/01/2016      
02    13/05/2016      
02    14/05/2016    2  
03    08/06/2016    3  
03    08/06/2016    4  
04    09/07/2016    5  
04    09/07/2016    
05    10/08/2016    
05    10/08/2016    6

In essence, for each ID there are three scenarios for the variable of interest Var in regards to the date of submission Date:

1/ Var present on the earlier Date but missing on the later one (ID=01)
2/ Var missing on the earlier Date but present on the later one (ID=02)
3/ Var present on both Date, whose values may be the same or different (ID=03).

Now how can I modify Var as follows?

1/ Use value of Var of the earlier Date to fill up the later one.
2/ Use value of Var of the later Date to fill up the earlier one.
3/ Use value of Var of the later Date to replace the earlier one.

So the modified data should look like:

ID    Date          Var  
01    21/01/2016    1  
01    22/01/2016    1  
02    13/05/2016    2  
02    14/05/2016    2  
03    08/06/2016    4  
03    08/06/2016    4  

Upvotes: 0

Views: 488

Answers (1)

Nick Cox
Nick Cox

Reputation: 37278

Your examples are solved by

bysort ID (Var): replace Var = Var[1] if missing(Var) 
bysort ID (Date): replace Var = Var[_N] 

Replacing missing values in series using nearby known values is called interpolation. See e.g. here for notice of an interpolation command supporting the first method more generally (and other methods too).

Upvotes: 1

Related Questions