Reputation: 327
What is the R equivalent of this?
SELECT <select_list>
FROM Table_A A
LEFT JOIN Table_B B
ON A.Key = B.Key
WHERE B.Key IS NULL
See https://www.codeproject.com/articles/33052/visual-representation-of-sql-joins - 'Left excluding JOIN'
I have two columns of data:
A
Amsterdam
Copenhagen
LA
Lisbon
London
Madrid
New York
Paris
Rome
Stockholm
and
B
Amsterdam
Buenos Aires
Copenhagen
LA
London
Sydney
Tokyo
How do I return the lists:
In A ONLY
Lisbon
Madrid
New York
Paris
Rome
Stockholm
and
In B ONLY
Buenos Aires
Sydney
Tokyo
in R?
I've been trying and searching for solutions using the merge()
function for about three hours.
Similarly, is there a way to do a left join in R that returns both columns?
For example, if I call:
A <- data.frame(A = c("Amsterdam",
"Copenhagen",
"LA",
"Lisbon",
"London",
"Madrid",
"New York",
"Paris",
"Rome",
"Stockholm"))
B <- data.frame(B = c("Amsterdam",
"Buenos Aires",
"Copenhagen",
"LA",
"London",
"Sydney",
"Tokyo"))
merge(A, B, by.x = "A", by.y = "B", all.x = T)
I get
A
1 Amsterdam
2 Copenhagen
3 LA
4 Lisbon
5 London
6 Madrid
7 New York
8 Paris
9 Rome
10 Stockholm
But I want
A B
1 Amsterdam Amsterdam
2 Copenhagen Copenhagen
3 LA LA
4 Lisbon NULL
5 London London
6 Madrid NULL
7 New York NULL
8 Paris NULL
9 Rome NULL
10 Stockholm NULL
Upvotes: 0
Views: 3599
Reputation: 1
Sorry, my first comment on here so I dont know how to correctly format answer to look like R yet
Use anti_join(A,B) and the anti_join(B,A). The results:
anti_join(A,B)
Joining, by = "Y"
Y
1 Lisbon
2 Madrid
3 New York
4 Paris
5 Rome
6 Stockholmanti_join(B,A)
Joining, by = "Y"
Y
1 Buenos Aires
2 Sydney
3 Tokyo
And to merge them two by matches:
aux<- inner_join(A,B)
aux2<- anti_join(B,A)
i<-1:nrow(A)
c<-ifelse(A$Y[i] %in% aux$Y, A$Y[i],"NA")
A<-cbind(A,c)
A<-rbind(A,data.frame(Y=rep("NA",nrow(anti_join(B,A))),c=aux2$Y))
A
Y c
1 Amsterdam Amsterdam
2 Copenhagen Copenhagen
3 LA LA
4 Lisbon NA
5 London London
6 Madrid NA
7 New York NA
8 Paris NA
9 Rome NA
10 Stockholm NA
11 NA Buenos Aires
12 NA Sydney
13 NA Tokyo
Upvotes: 0
Reputation: 1
For excluding the city mentioned in b
from a
. a
is left table and b
is right table. the below is similar to left excluding join. Left excluding join = Left join - inner join
.
a <- data.frame(a);
colnames(a) <- city;
b <- data.frame(b);
colnames(b) <- city;
b$indicator <- c("indicator");
a_excl_b <- merge(x = a, y = b, all = FALSE, all.x = TRUE, all.y = FALSE);
a_excl_b <- subset(a_excl_b, is.na(indicator));
a_excl_b <- data.frame(a_excl_b$city);
colnames(a_excl_b ) <- c("city");
Upvotes: 0
Reputation: 33488
You could do something like the following:
df <- data.frame(A, B = "NULL", stringsAsFactors = FALSE)
isinB <- df$A %in% B$B
df[isinB, "B"] <- as.character(df[isinB, "A"])
print(df)
A B
1 Amsterdam Amsterdam
2 Copenhagen Copenhagen
3 LA LA
4 Lisbon NULL
5 London London
6 Madrid NULL
7 New York NULL
8 Paris NULL
9 Rome NULL
10 Stockholm NULL
Upvotes: 1
Reputation: 887068
We create a column of 'A' in 'B' and then do the merge
merge(A, transform(B, A = B), all.x = TRUE)
# A B
#1 Amsterdam Amsterdam
#2 Copenhagen Copenhagen
#3 LA LA
#4 Lisbon <NA>
#5 London London
#6 Madrid <NA>
#7 New York <NA>
#8 Paris <NA>
#9 Rome <NA>
#10 Stockholm <NA>
Upvotes: 1
Reputation: 37641
## A only
setdiff(A,B)
[1] "Lisbon" "Madrid" "New York" "Paris" "Rome" "Stockholm"
## B only
setdiff(B,A)
[1] "Buenos Aires" "Sydney" "Tokyo"
Upvotes: 2