Phil_T
Phil_T

Reputation: 1002

r - dplyr full_join using column position

I the following dataframes:

a <- c(1,1,1)
b<- c(10,8,2)
c<- c(2,2)
d<- c(3,5)

AB<- data.frame(a,b)
CD<- data.frame(c,d)

I would like to join AB and CD, where the first column of CD is equal to the second column of AB. Please note that my actual data will have a varying number of columns, with varying names, so I am really looking for a way to join based on position only. I have been trying this:

#Get the name of the last column in AB
> colnames(AB)[ncol(AB)]
[1] "b"
#Get the name of the first column in CD
> colnames(CD)[1]
[1] "c"

Then I attempt to join like this:

> abcd <- full_join(AB, CD, by = c(colnames(AB)[ncol(AB)]=colnames(CD)[1]))
Error: unexpected '=' in "abcd <- full_join(AB, CD, by = c(colnames(AB)[ncol(AB)]="

The behavior I am looking for is essentially this:

> abcd<- full_join(AB, CD, by = c("b" = "c"))
> abcd
  a  b  d
1 1 10 NA
2 1  8 NA
3 1  2  3
4 1  2  5

Upvotes: 2

Views: 1396

Answers (2)

akrun
akrun

Reputation: 887128

We can do setNames

full_join(AB, CD, setNames(colnames(CD)[1], colnames(AB)[ncol(AB)]))
#  a  b  d
#1 1 10 NA
#2 1  8 NA
#3 1  2  3
#4 1  2  5

Upvotes: 6

www
www

Reputation: 39154

We can replace the target column names with a common name, such as "Target", and then do full_join. Finally, replace the "Target" name with the original column name.

library(dplyr)

AB_name <- names(AB)
target_name <- AB_name[ncol(AB)] # Store the original column name

AB_name[ncol(AB)] <- "Target" # Set a common name
names(AB) <- AB_name

CD_name <- names(CD)
CD_name[1] <- "Target" # Set a common name
names(CD) <- CD_name

abcd <- full_join(AB, CD, by = "Target") %>% # Merge based on the common name
  rename(!!target_name := Target) # Replace the common name with the original name
abcd
#   a  b  d
# 1 1 10 NA
# 2 1  8 NA
# 3 1  2  3
# 4 1  2  5

Upvotes: 1

Related Questions