Reputation: 123
I have the following sets of data:
df1 <- data.frame( country = c("A", "B","A","B"), year = c(2011,2011,2012,2012), variable_1= c(1,3,5,7))
df2 <- data.frame( country = c("A", "B","A","B"), year = c(2011,2012,2012,2013), variable_2= c(2,4,6,8))
df3 <- data.frame( country = c("A", "C","C"), year = c(2011,2011,2013), variable_3= c(9,9,9))
I want to reshape them into a panel data model, so I can get the following result:
df4 <- data.frame( country = c("A","A","A","B","B","B","C","C","C"), year = c(2011,2012,2013,2011,2012,2013,2011,2012,2013), variable_1 = c(1,5,NA,3,7,NA,NA,NA,NA), variable_2 = c(2,6,NA,NA,4,8,NA,NA,NA), variable_3 = c(9,NA,NA,NA,NA,NA,9,NA,9) )
I have searched for this info, but the topics I found (Reshaping panel data) didn´t help me.
Any ideas on how to do that? My real data sets have thousands of lines ("countries"), several variables, years and NA´s, so please take that into account.
Upvotes: 2
Views: 80
Reputation: 21641
Try
library(tidyr)
library(dplyr)
Reduce(full_join, list(df1, df2, df3)) %>%
complete(country, year)
Which gives:
#Source: local data frame [9 x 5]
#
# country year variable_1 variable_2 variable_3
# (chr) (dbl) (dbl) (dbl) (dbl)
#1 A 2011 1 2 9
#2 A 2012 5 6 NA
#3 A 2013 NA NA NA
#4 B 2011 3 NA NA
#5 B 2012 7 4 NA
#6 B 2013 NA 8 NA
#7 C 2011 NA NA 9
#8 C 2012 NA NA NA
#9 C 2013 NA NA 9
Upvotes: 5