stevec
stevec

Reputation: 52268

dplyr / bigrquery way of querying/binding multiple tables of the same schema in BigQuery?

Using the methods outlined in this readme doc, it's straight forward to query a single table, like so

library(bigrquery)
library(dplyr)

natality <- tbl(con, "natality")

natality %>%
  select(year, month, day, weight_pounds) %>% 
  head(10) %>%
  collect()

This allows us to write regular dplyr code against natality, and bigrquery translates that dplyr code into a BigQuery query.

But suppose the natality table was instead 2 (or more) separate tables named natality1, natality2, and that they could be rbind'd together.

How can I do this using BigQuery? That is, how can I query these separate tables as though they are all together as one table?

What I tried

I thought bind_rows may work, but it doesn't.

library(bigrquery)
library(dplyr)

natality1 <- tbl(con, "natality1")
natality2 <- tbl(con, "natality2")

natality1 %>% bind_rows(., natality2) %>%
  select(year, month, day, weight_pounds) %>% 
  head(10) %>%
  collect()

Notes

Upvotes: 0

Views: 463

Answers (1)

Simon.S.A.
Simon.S.A.

Reputation: 6931

I am not aware of dbplyr having rbind functionality. Perhaps because not every database backend supports it?

The way I solve this problem in SQL Server is with a custom function that writes an explicit UNION ALL query (the SQL Server equivalent of rbind). Below is an example function for SQL (link to other functions from this approach). Perhaps this example can serve as inspiration for an equivalent bigquery approach?

union_all = function(table_a,table_b){
  # extract the connection
  db_connection = table_a$src$con

  sql_query = build_sql(con = db_connection,
                      sql_render(table_a), # the SQL code that defines table A
                      "\nUNION ALL\n", # insert union all command between them
                      sql_render(table_b) # the SQL code that defines table B
  )

  return(tbl(db_connection, sql(sql_query)))
}

The idea is that union_all() returns a remote table object defined by a query with the rbind equivalent command within it.

Upvotes: 1

Related Questions