Vishal John

Reputation: 4382

BigQuery API Java client intermittently returning bad results

I am executing some long-running queries using the BigQuery Java client.

I construct a BigQuery job and execute it like this:

val queryRequest = new QueryRequest().setQuery(query)
val queryJob = client.jobs().query(ProjectId, queryRequest)
queryJob.execute()

The problem I am facing is that, for the same query, the client sometimes returns before the job is complete, i.e. the number of rows in the result is zero.

I tried printing the response and it shows:

{"jobComplete":false,"jobReference":{"jobId":"job_bTLRGrw5_xR26i9Li3a9EQvuA6c","projectId":"analytics-production"},"kind":"bigquery#queryResponse"}

From that I can see that the job is not complete. Then why did the client return before the job was complete?

While building the client, I use an HttpRequestInitializer, and in its initialize method I provide the timeout parameters:

override def initialize(request: HttpRequest): Unit = {
  request.setConnectTimeout(...)
  request.setReadTimeout(...)
}

I tried giving high values for the timeouts, such as 240 seconds, but no luck. The behavior is still the same: it fails intermittently.

Upvotes: 1

Views: 300

Answers (1)

Pentium10

Reputation: 208042

Make sure you set the timeout on the BigQuery request body, and not on the HTTP object.

// setTimeoutMs takes a java.lang.Long, so pass a Long literal from Scala
val queryRequest = new QueryRequest().setQuery(query).setTimeoutMs(10000L) // 10 seconds

The param is timeoutMs. This is documented here: https://cloud.google.com/bigquery/docs/reference/v2/jobs/query

Please also read the docs regarding this field: "How long to wait for the query to complete, in milliseconds, before the request times out and returns. Note that this is only a timeout for the request, not the query. If the query takes longer to run than the timeout value, the call returns without any results and with the 'jobComplete' flag set to false. You can call GetQueryResults() to wait for the query to complete and read the results. The default value is 10000 milliseconds (10 seconds)."
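To make the "call GetQueryResults() to wait" part concrete, here is a minimal sketch of a polling loop. The helper name pollUntilComplete and the maxAttempts limit are my own assumptions, not part of the client library. The BigQuery wiring is shown only in comments, since it needs the google-api-services-bigquery dependency and the client, ProjectId and queryRequest values from the question.

```scala
import scala.annotation.tailrec

object QueryPolling {
  // Hypothetical helper: call `fetch` until `isComplete` accepts the response,
  // giving up after `maxAttempts` calls. Returns None if it never completed.
  @tailrec
  def pollUntilComplete[A](fetch: () => A,
                           isComplete: A => Boolean,
                           maxAttempts: Int): Option[A] =
    if (maxAttempts <= 0) None
    else {
      val response = fetch()
      if (isComplete(response)) Some(response)
      else pollUntilComplete(fetch, isComplete, maxAttempts - 1)
    }

  // Assumed wiring to the client from the question (not compiled here):
  //
  //   val first = client.jobs().query(ProjectId, queryRequest).execute()
  //   val jobId = first.getJobReference.getJobId
  //   val done  = pollUntilComplete[GetQueryResultsResponse](
  //     () => client.jobs().getQueryResults(ProjectId, jobId)
  //             .setTimeoutMs(10000L).execute(),
  //     r  => java.lang.Boolean.TRUE == r.getJobComplete,
  //     maxAttempts = 30)
  //
  // Each getQueryResults call waits on the server for up to timeoutMs,
  // so the loop does not need its own sleep between attempts.
}
```

If the first response already has jobComplete set to true, you can read the rows from it directly and skip the polling.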

More about synchronous queries here:
https://cloud.google.com/bigquery/querying-data#syncqueries

Upvotes: 2
