Matching multiple tables (dataframes) using ID

Question

I have the following 3 tables:

ID	H2
3254D	YUB
1011A	WSX
1089	WER
1043	RFR

ID	Q2
1021A	QAZ
1089	EDR
3254D	DFH
1043	RFV

ID	X2
1011A	WSX
1569V	PKJ
1089	UIO
3254D	BHY

As a newbie to R,I am looking for a way to find the intersecting columns in these 3 tables using the ID column. The desired output I am looking at are the following 3 tables:

ID	H2
1089	WER
3254D	YUB

ID	Q2
1089	EDR
3254D	DFH

ID	X2
1089	UIO
3254D	BHY

Please also note, that I also have a situation where it can be more than 3 tables, i.e., the solution should be able to generalize to more than 3 tables. Appreciate any suggestions. Cheers.

akrun · Accepted Answer

We can use intersect to get the intersecting 'ID' from all the datasets . Then, use that to subset the individual datasets

ids <- Reduce(intersect, list(df1$ID, df2$ID, df3$ID))
subdf1 <- subset(df1, ID %in% ids)
subdf2 <- subset(df2, ID %in% ids)
subdf3 <- subset(df3, ID %in% ids)

-output

subdf1
#     ID  H2
#1 3254D YUB
#3  1089 WER
subdf2
#     ID  Q2
#2  1089 EDR
#3 3254D DFH

subdf3
#     ID  X2
#3  1089 UIO
#4 3254D BHY

If there are many datasets, and the object names follow a pattern i.e. 'df' followed by some digits, then use mget to get them into a list

lst1 <- mget(ls(pattern = '^df\d+$'))

Get the common intersecting elements after extracting the 'ID' column

ids <- Reduce(intersect, lapply(lst1, `[[`, "ID"))

and subset the list elements and keep it in a list

lst2 <- lapply(lst1, subset, subset = ID %in% ids)

-output

lst2
#$df1
#     ID  H2
#1 3254D YUB
#3  1089 WER

#$df2
#     ID  Q2
#2  1089 EDR
#3 3254D DFH

#$df3
#     ID  X2
#3  1089 UIO
#4 3254D BHY

data

df1 <- structure(list(ID = c("3254D", "1011A", "1089", "1043"), H2 = c("YUB", 
"WSX", "WER", "RFR")), class = "data.frame", row.names = c(NA, 
-4L))

df2 <- structure(list(ID = c("1021A", "1089", "3254D", "1043"), Q2 = c("QAZ", 
"EDR", "DFH", "RFV")), class = "data.frame", row.names = c(NA, 
-4L))

df3 <- structure(list(ID = c("1011A", "1569V", "1089", "3254D"), X2 = c("WSX", 
"PKJ", "UIO", "BHY")), class = "data.frame", row.names = c(NA, 
-4L))

Matching multiple tables (dataframes) using ID

Answers (1)

data

Related Questions