Merging datasets of which only some columns match

Question

I want to merge about 27 different datasets. For each of these datasets there are many identical columns, but also many columns that are not identical. I would like the identical columns to "stack" while the non-identical columns are simply added.

Let's assume that datasets 1 and 2 look as follows:

library(data.table)
df1 <- fread(
    "A   B   C  iso   year   
     0   1   1  NLD   2009   
     1   0   2  NLD   2009   
     0   0   3  AUS   2011   
     1   0   4  AUS   2011   
     0   1   7  NLD   2008   
     1   0   1  NLD   2008   
     0   0   3  AUS   2012",
  header = TRUE
)
df2 <- fread(
    "A   B   D  E  iso   year   
     0   1   1  NA ECU   2009   
     1   0   2  0  ECU   2009   
     0   0   3  0  BRA   2011   
     1   0   4  0  BRA   2011   
     0   1   7  NA ECU   2008   
     1   0   1  0  ECU   2008   
     0   0   3  2  BRA   2012   
     1   0   4  NA BRA   2012",
  header = TRUE
)

I would now like to put these df's together with:

rbind(df1, df2, fill=TRUE)

The merged dataset should look as follows:

df_merged <- fread(
    "A   B   C  D   E   iso   year   
     0   1   1  NA  NA  NLD   2009   
     1   0   2  NA  NA  NLD   2009   
     0   0   3  NA  NA  AUS   2011   
     1   0   4  NA  NA  AUS   2011   
     0   1   7  NA  NA  NLD   2008   
     1   0   1  NA  NA  NLD   2008   
     0   0   3  NA  NA  AUS   2012   
     0   1   NA  1  NA  ECU   2009   
     1   0   NA  2  0   ECU   2009   
     0   0   NA  3  0   BRA   2011   
     1   0   NA  4  0   BRA   2011   
     0   1   NA  7  NA  ECU   2008   
     1   0   NA  1  0   ECU   2008   
     0   0   NA  3  2   BRA   2012   
     1   0   NA  4  NA  BRA   2012",",
   header = TRUE
)

However for some reason, when applying this to the actual data I get the error:

    Error in rbind(deparse.level, ...) : 
      numbers of columns of arguments do not match

I have tried to recreate the error in my example (by making the amount of columns and rows different for the df's) but failed to do so.

Any ideas?

EDIT: So apparently the issue was that the classes in my two datasets were not identical in some occasions where the column names were. I will post an update after I got it to work.

Sotos · Accepted Answer

You can use rbindlist, i.e.

library(data.table)
rbindlist(list(df1, df2), fill = TRUE)

Merging datasets of which only some columns match

Answers (2)

Related Questions