Reputation: 1498
I'm trying to learn a bit of R, and can't figure out how to join 2 tables with dplyr. If I did this strictly in SQL, I would use:
select a1.[some column], a2.[other column], a3.[last column]
from vwTable1 a1
join vwTable2 a2
on a2.myKey = a1.mykey
join vwTable3 a3
on a3.otherKey = a1.otherKey
where a2.myItem = '12345'
and a1.myDate between '1/1/2021' and '3/31/2021'
In R, I can connect to the SQL database:
library(odbc)
library(dplyr)
con <- dbConnect(odbc(),
Driver = "SQL Server",
Server = "myServer",
Database = "myDatabase",
Port = "1234",
Trusted_Connection = "Yes")
I'm just not sure if I need to establish references first, like:
a1 <- tbl(con, "vwTable1")
a2 <- tbl(con, "vwTable2")
a3 <- tbl(con, "vwTable3")
...and then join the 2 references?
The goal is to put the query results into a data frame (if that's the best way to do it) that will be copied to a SQL table on another server.
I did try the following, but R Console just sat there at a + waiting for me to do something else:
a4 <- tbl(con, sql("SELECT a1.[some column], a2.[other column], a2.[last column]
FROM vwTable1 a1
JOIN vwTable2 a2 on a2.myKey = a1.myKey
JOIN vwTable3 a3 on a3.otherKey = a1.otherKey
WHERE a2.myItem = '12345'
AND a1.myDate BETWEEN '1/1/2021' AND '3/31/2021'))
EDIT:
This was the solution:
results <- tbl(con, "vwTable1") %>% filter(myDate >= "2021-01-01" & myDate <= "2021-03-31") %>%
inner_join(tbl(con, "vwTable2") %>% filter(myItem == '12345'), by = "myKey") %>%
inner_join(tbl(con, "vwTable3"), by = "otherKey")
Upvotes: 0
Views: 197
Reputation: 66415
The dplyr syntax would be something like
tbl(con, "vwTable1") %>%
filter(myDate >= "2021-01-01" & myDate <= "2021-01-02") %>% # edit
inner_join(tbl(con, "vwTable2") %>%
filter(myItem == '12345'),
by = "myKey") %>%
inner_join(tbl(con, "vwTable3"), by = "otherKey")
# Edit - I think the default is inner_join, not left_join as I had originally
Upvotes: 2