Michael Ames
Michael Ames

Reputation: 2617

How retrieve the user and SQL code associated with a BigQuery job?

For security purposes, I'd like to be able to audit who is running what kinds of queries within a certain project. Is this possible?

Using bq ls -j from the command line gives some info; bg show -j gives a little more. But neither show the user or the query itself.

Upvotes: 4

Views: 2086

Answers (2)

Sander van den Oord
Sander van den Oord

Reputation: 12808

You also use sql to query the INFORMATION_SCHEMA to get user and query.

For example like this:

SELECT 
    creation_time, 
    job_type, 
    query, 
    user_email, 
    job_id, 
    parent_job_id, 
    cache_hit, 
    total_bytes_processed, 
    total_bytes_billed
FROM `your_project_id.region-eu.INFORMATION_SCHEMA.JOBS`
WHERE creation_time >= '2023-01-11'  -- table is partitioned on creation_time

See also:
https://cloud.google.com/bigquery/docs/information-schema-jobs

Upvotes: 1

Pentium10
Pentium10

Reputation: 207912

In the CLI, you can run bq ls -j -a to retrieve jobs for all users in a project.

Then you can run for each job id a bq show -j <job_id> and in order to have more details you will choose to use the json response:

bq show --format=prettyjson -j job_joQEqPwOiOoBlOhDBEgKxQAlKJQ

this returns the following format which have your query, your user and bytesprocessed etc...

{
  "configuration": {
    "dryRun": false, 
    "query": {
      "createDisposition": "CREATE_IF_NEEDED", 
      "destinationTable": {
        "datasetId": "", 
        "projectId": "", 
        "tableId": ""
      }, 
      "query": "", 
      "writeDisposition": "WRITE_TRUNCATE"
    }
  }, 
  "etag": "", 
  "id": "", 
  "jobReference": {
    "jobId": "", 
    "projectId": ""
  }, 
  "kind": "bigquery#job", 
  "selfLink": "", 
  "statistics": {
    "creationTime": "1435006022346", 
    "endTime": "1435006144730", 
    "query": {
      "cacheHit": false, 
      "totalBytesProcessed": "105922683030"
    }, 
    "startTime": "1435006023171", 
    "totalBytesProcessed": "105922683030"
  }, 
  "status": {
    "state": "DONE"
  }, 
  "user_email": ""
}

Using the API you need to pass allUsers property to list jobs from all users https://cloud.google.com/bigquery/docs/reference/v2/jobs/list#allUsers

Upvotes: 5

Related Questions