Reputation: 311
I am trying trying to use dplyr to access/manipulate a table view in Google BigQuery. However when I try to aggregate the table using dplyr functions (select, filter, etc.) I get this error:
Error: Cannot reference a SQL view in a Legacy SQL query
The code below is to illustrate what I am trying to do.
#source the table
pd = src_bigquery(project, dataset) %>%
tbl(table)
#get a aggregated view of the table that is filtered on a specific date
pdSelect = pd %>%
select(id, date) %>%
filter(date =="2017-03-15") %>%
collect()
Is there a way to not use legacy SQL when using dplyr? For example, in the Google Bigquery web UI it says "By default, BigQuery runs queries using legacy SQL. Uncheck this to run queries using BigQuery's updated SQL dialect with improved standards compliance." When I am working in that environment I typically just unselect it so it stops using legacy SQL.
Thanks for your help!
Upvotes: 1
Views: 3943
Reputation: 43
As well as passing in the shebangs you can set the use_legacy_sql
flags into bigrquery
and dplyr
connectors. This can be useful if you're accessing a view that was built with standard SQL.
bigrquery
Pass the use_legacy_sql flag to query_exec
:
bigrquery::query_exec("SELECT * FROM `datasetName.standardSQLView`",
project ='bQProjectName',
use_legacy_sql = FALSE)
dplyr
Pass the use_legacy_sql flag to dbConnect
:
bQCon = DBI::dbConnect(bigrquery::dbi_driver(),
project = 'bQProjectName',
dataset = 'datasetName',
use_legacy_sql = FALSE)
dplyr::tbl(bQCon, 'standardSQLView')
Upvotes: 0
Reputation: 33765
Edit: it should be possible to use standard SQL by putting the #standardSQL
shebang at the top of your query.
Original response:
It looks like there is no option in the source for the BigQuery connector with dplyr to use standard SQL, and the #standardSQL
shebang doesn't work. There was an issue submitted in relation to useLegacySql that you could upvote or comment on, or you could consider submitting a pull request.
Upvotes: 3