Reputation: 52268
Using the methods outlined in this readme doc, it's straightforward to query a single table, like so:
library(bigrquery)
library(dplyr)

# con: an existing BigQuery connection, e.g. from DBI::dbConnect(bigrquery::bigquery(), ...)
natality <- tbl(con, "natality")

natality %>%
  select(year, month, day, weight_pounds) %>%
  head(10) %>%
  collect()
This allows us to write regular dplyr code against natality, and bigrquery translates that dplyr code into a BigQuery query.
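(To see that translation, show_query() from dplyr prints the generated SQL instead of running the query:)
natality %>%
  select(year, month, day, weight_pounds) %>%
  head(10) %>%
  show_query()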
But suppose the natality table were instead split into two (or more) separate tables, named natality1 and natality2, that could be rbind'd together.
How can I do this using BigQuery? That is, how can I query these separate tables as though they are all together as one table?
I thought bind_rows might work, but it doesn't (it operates on local data frames, not on remote tables).
library(bigrquery)
library(dplyr)

natality1 <- tbl(con, "natality1")
natality2 <- tbl(con, "natality2")

natality1 %>%
  bind_rows(natality2) %>%
  select(year, month, day, weight_pounds) %>%
  head(10) %>%
  collect()
Upvotes: 0
Views: 463
Reputation: 6931
I am not aware of dbplyr having rbind functionality. Perhaps because not every database backend supports it?
The way I solve this problem in SQL Server is with a custom function that writes an explicit UNION ALL query (the SQL Server equivalent of rbind). Below is an example function for SQL Server (link to other functions from this approach). Perhaps this example can serve as inspiration for an equivalent BigQuery approach?
library(dbplyr)  # for build_sql() and sql_render()

union_all <- function(table_a, table_b) {
  # extract the connection from the first table
  db_connection <- table_a$src$con

  # write an explicit UNION ALL query combining the two table definitions
  sql_query <- build_sql(
    con = db_connection,
    sql_render(table_a),  # the SQL code that defines table A
    "\nUNION ALL\n",      # insert the UNION ALL command between them
    sql_render(table_b)   # the SQL code that defines table B
  )

  # return a remote table defined by the combined query
  return(tbl(db_connection, sql(sql_query)))
}
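As a usage sketch (assuming con, natality1, and natality2 are the remote tables from the question):
natality1 <- tbl(con, "natality1")
natality2 <- tbl(con, "natality2")

union_all(natality1, natality2) %>%
  select(year, month, day, weight_pounds) %>%
  head(10) %>%
  collect()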
The idea is that union_all() returns a remote table object defined by a query with the rbind-equivalent command within it.
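If you prefer to skip the helper, the same idea can be written directly, since tbl() also accepts a raw SQL query (a minimal sketch, assuming the table names from the question):
natality_all <- tbl(con, sql("SELECT * FROM natality1 UNION ALL SELECT * FROM natality2"))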
Upvotes: 1