Reputation: 113
I have three tables: CUSTOMER (PK: CustomerNum), RESERVATION (PK: CustomerNum and TripID NOT PK), and TRIP (PK: TripID). Trying to use purr:reduce.
Tried the following code.
tables <- list(customer, reservation, trip)
reduce(tables, inner_join, by = c("CustomerNum", "TripID"))
Error: by
can't contain join column TripID
which is missing from LHS
Upvotes: 0
Views: 109
Reputation: 886948
We can use a for
loop when the by
columns are different in each step
grp <- c("CustomerNum", "TripID")
out <- customer
for(i in seq_along(grp)) {
out <- inner_join(out, tables[[i+1]], by = grp[i])
}
out
# CustomerNum val TripID newInfo
#1 1 -0.5458808 4 *
#2 2 0.5365853 2 ****
then select
the columns of interest
Note that if we don't provide the by
, it would automatically pick up the by
based on the availability of common column name based on the reproducible example below. As the OP didn't provide any reproducible example, it is unclear about the situation
reduce(tables, inner_join)
#Joining, by = "CustomerNum" #### <-----
#Joining, by = "TripID" #### <-----
# CustomerNum val TripID newInfo
#1 1 -0.5458808 4 *
#2 2 0.5365853 2 ****
set.seed(24)
customer <- data.frame(CustomerNum = 1:5, val = rnorm(5))
reservation <- data.frame(CustomerNum = 1:3, TripID = c(4, 2, 8))
trip <- data.frame(TripID = c(4, 9, 2), newInfo = c("*", "**", "****"))
tables <- list(customer, reservation, trip)
Upvotes: 1