duuug
duuug

Reputation: 45

Syntax Error with Validated Query when sent into BigQuery via Python API Client

Here is my query:

SELECT hits.page.pagePath
FROM [(project_id):(dataset_id).ga_sessions_20151019]
GROUP BY hits.page.pagePath LIMIT 1

It runs in the web UI.

Here is my code:

from oauth2client.service_account import ServiceAccountCredentials
from httplib2 import Http
from apiclient.discovery import build
import json

query = "SELECT hits.page.pagePath FROM [(project_id):(dataset_id).ga_sessions_20151019] GROUP BY hits.page.pagePath LIMIT 1",

path = (filepath of credentials json file)
scopes = ['https://www.googleapis.com/auth/bigquery']
credentials = ServiceAccountCredentials.from_json_keyfile_name(path,scopes)
http_auth = credentials.authorize(Http())
bigquery = build('bigquery','v2',http=http_auth)

req_body = {
    "timeoutMs": 60000,
    "kind": "bigquery#queryRequest",
    "dryRun": False,
    "useQueryCache": True,
    "useLegacySql": False,
    "maxResults": 100,
    "query": query,
    "preserveNulls": True,
  }

bigquery.jobs().query(projectId=(project_id),body=req_body).execute()

When I run this, I get the following error:

HttpError: <HttpError 400 when requesting https://www.googleapis.com/bigquery/v2/projects/cardinal-path/queries?alt=json returned "Syntax error: Unexpected "["">

It doesn't seem to like the brackets in my query string, but I don't know how to escape them (if that is the issue). Does anyone see what I'm doing wrong? I don't thing it's an issue with my connection to the API because I am able to see all the jobs I've started (which have all failed due to the above HttpError / Syntax Error) by calling the service object's ('bigquery' above) jobs().list() function. Thanks!

Upvotes: 1

Views: 12234

Answers (1)

Michael Sheldon
Michael Sheldon

Reputation: 2057

I see you are setting useLegacySql to False in your query request.

Bracket-quoting literals like [projectid:datasetid.tableid] is part of the legacy BigQuery SQL dialect.

The new sql dialect uses back-ticks to quote literals. So try:

SELECT hits.page.pagePath FROM `project_id:dataset_id.ga_sessions_20151019` GROUP BY hits.page.pagePath LIMIT 1

Alternately, since you are passing project_id as the project you are running the job in, all dataset lookups will resolve to that project by default, so you can drop the projectid: prefix and just use datasetid.tableid like:

SELECT hits.page.pagePath FROM dataset_id.ga_sessions_20151019 GROUP BY hits.page.pagePath LIMIT 1

While this is convenient for user-typed queries, if all your queries are code generated it is probably safest to always use quoted fully-qualified referenced.

Update: Another alternative is to use SQL's standard dot separator with non legacy SQL dialect, i.e.

SELECT hits.page.pagePath 
FROM project_id.dataset_id.ga_sessions_20151019
GROUP BY hits.page.pagePath LIMIT 1

Upvotes: 4

Related Questions