mdivk
mdivk

Reputation: 3727

How do I retrieve info of a job associated with a Google BigQuery execution on a public dataset?

I ran a query against BigQuery from the Cloud Shell:

xenonxie@cloudshell:~ (welynx)$ bq query --dry_run "SELECT COUNT(1) as rowcount, COUNTIF(corpus = 'hamlet') as rowcount_hamlet FROM publicdata.samples.shakespeare order by rowcount_hamlet desc"
Query successfully validated. Assuming the tables are not modified, running this query will process 2464625 bytes of data.
xenonxie@cloudshell:~ (welynx)$ bq query "SELECT COUNT(1) as rowcount, COUNTIF(corpus = 'hamlet') as rowcount_hamlet FROM publicdata.samples.shakespeare order by rowcount_hamlet desc"
Waiting on bqjob_r152b89ff4ea17df1_0000016faa8d1546_1 ... (0s) Current status: DONE   
+----------+-----------------+
| rowcount | rowcount_hamlet |
+----------+-----------------+
|   164656 |            5318 |
+----------+-----------------+

I can see that there is a job associated with it:

xenonxie@cloudshell:~ (welynx)$ bq ls -j -a
                    jobId                      Job Type    State      Start Time         Duration
 -------------------------------------------- ---------- --------- ----------------- ----------------
  bqjob_r152b89ff4ea17df1_0000016faa8d1546_1   query      SUCCESS   15 Jan 13:52:50   0:00:00.886000

Now, I want to retrieve the job's details as described in the BigQuery REST API documentation here:

https://bigquery.googleapis.com/bigquery/v2/projects/bqjob_r152b89ff4ea17df1_0000016faa8d1546_1/jobs

However, I received the error below:

xenonxie@cloudshell:~ (welynx)$ wget https://bigquery.googleapis.com/bigquery/v2/projects/bqjob_r152b89ff4ea17df1_0000016faa8d1546_1/jobs
--2020-01-15 15:10:23--  https://bigquery.googleapis.com/bigquery/v2/projects/bqjob_r152b89ff4ea17df1_0000016faa8d1546_1/jobs
Resolving bigquery.googleapis.com (bigquery.googleapis.com)... 173.194.217.95, 2607:f8b0:400c:c0d::5f
Connecting to bigquery.googleapis.com (bigquery.googleapis.com)|173.194.217.95|:443... connected.
HTTP request sent, awaiting response... 401 Unauthorized

Username/Password Authentication Failed.

Can anyone enlighten me as to what is going wrong here and how do I fix it? Thank you very much.

Upvotes: 0

Views: 785

Answers (1)

Maxim
Maxim

Reputation: 4441

There are a couple things that don't look right here:

  • While you might be able to use wget to call the BigQuery REST API, I do not recommend that method - the preferred, most commonly used and easy way to go method is by using cURL instead.
  • The BigQuery REST API method you linked is for listing all BigQuery jobs, not the details for a single one as you're asking. Moreover, the requested URL must be in the form of https://bigquery.googleapis.com/bigquery/v2/projects/{projectId}/jobs as mentioned there - you've omitted {projectId} and placed the job ID in its place instead.
  • Regardless, to view the details of single job ID, the BigQuery REST API method jobs.get should be used instead, where the URL must be in the form of https://bigquery.googleapis.com/bigquery/v2/projects/{projectId}/jobs/{jobId}, replacing {projectId} with your project ID and {jobId} with the BigQuery job ID.
  • To call the REST API using cURL you must provide some form of authentication. One way is to use the access token, which can be obtained by running gcloud auth print-access-token from i.e the Cloud Shell.

Given my previous points, and to summarize, the REST API call that should give you what you're looking for (using cURL) should like like this in the end:

curl -H "Authorization: Bearer "$(gcloud auth print-access-token) \
https://bigquery.googleapis.com/bigquery/v2/projects/{projectId}/jobs/bqjob_r152b89ff4ea17df1_0000016faa8d1546_1

Important: make sure you provide the project ID as a parameter, by replacing {projectId} (including the curly braces) accordingly, where the BigQuery job was created. Also, I copy-pasted the job ID from your question there, so depending on the job you want to check for, the last part should also be modified to your needs.

Upvotes: 2

Related Questions