user1617979
user1617979

Reputation: 2480

How do I select all data.table columns that are in a second data.table

I have several data.tables that have the same columns, and one that has some extra columns. I want to rbind them all but only on the common columns

with data.frames I could simply do

 rbind(df1[,names(df2)],df2,df3,...)

I can of course write all the column names in the form

  list(col1,col2,col3,col4) 

but this is not elegant, nor feasible if one has 1,000 variables

I am sure there is a way and I am not getting there - any help would be appreciated

Upvotes: 0

Views: 72

Answers (1)

akrun
akrun

Reputation: 887108

May be you can try:

DT1 <- data.table(Col1=1:5, Col2=6:10, Col3=2:6)
DT2 <- data.table(Col1=1:4, Col3=2:5)
DT3 <- data.table(Col1=1:7, Col3=1:7)
lst1 <- mget(ls(pattern="DT\\d+"))

ColstoRbind <- Reduce(`intersect`,lapply(lst1, colnames))
# .. "looks up one level"
res <- rbindlist(lapply(lst1, function(x) x[, ..ColstoRbind]))
res
#    Col1 Col3
# 1:    1    2
# 2:    2    3
# 3:    3    4
# 4:    4    5
# 5:    5    6
# 6:    1    2
# 7:    2    3
# 8:    3    4
# 9:    4    5
#10:    1    1
#11:    2    2
#12:    3    3
#13:    4    4
#14:    5    5
#15:    6    6
#16:    7    7

Update

As @Arun suggested in the comments, this might be better

rbindlist(lapply(lst1, function(x) {
          if(length(setdiff(colnames(x), ColstoRbind))>0) {  
               x[, ..ColstoRbind]
               }
              else x}))

Upvotes: 3

Related Questions