foxwendy
foxwendy

Reputation: 2929

How to Get Data Location Info of a BigQuery Dataset in script

We are aware that when using bq mk command to create a dataset in BigQuery, we can use flag --data_location to specify which region we want table data under this dataset to be located in.

We are now wanting to set up a monitor so that whenever someone creates a dataset outside of our designated location, we can trig an alert to the dataset owner. In order to do this, we'll need a script that can automatically scan through all the datasets and get the location information. we looked at both api calls and bq command line tool commands, there's no clue with regarding to showing/inquirying data location of a dataset. Wondering if there's a way to accomplish our goal?

Upvotes: 8

Views: 5515

Answers (3)

intotecho
intotecho

Reputation: 5684

Add jq to extract location from the output.

bq --format="json" --project_id=<project> show  <datasetname> | jq -r '.location'

returns just the location.

australia-southeast2

If you wanted the project that owns the dataset

project=$(bq --format="json" show  <datasetname> |  jq -r '.datasetReference.projectId')

Upvotes: 3

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172974

we looked at both api calls and bq command line tool commands, there's no clue with regarding to showing/inquirying data location of a dataset. Wondering if there's a way to accomplish our goal?

You can use API to accomplish this:

With Datasets: list API you can lists all datasets in the specified project

And then, with Datasets: get API you can returns the dataset specified by datasetID and check dataset's location property

Upvotes: 1

Pentium10
Pentium10

Reputation: 207838

To get all your datasets in the current project:

bq ls -d --format=json

If you run

bq show --format=json <dataset_name>

you get back a JSON that contains the location key:

{  
   "kind":"bigquery#dataset",
   "datasetReference":{  
      "projectId":"<edited>",
      "datasetId":"wr_temp"
   },
   "creationTime":"1479393712602",
   "access":[  
      {  
         "specialGroup":"projectWriters",
         "role":"WRITER"
      },
      {  
         "specialGroup":"projectOwners",
         "role":"OWNER"
      },
      {  
         "role":"OWNER",
         "userByEmail":"<edited>"
      },
      {  
         "specialGroup":"projectReaders",
         "role":"READER"
      }
   ],
   "defaultTableExpirationMs":"604800000",
   "etag":"<edited>",
   "location":"US",
   "lastModifiedTime":"1479393712602",
   "id":"<edited>",
   "selfLink":"https://www.googleapis.com/bigquery/v2/projects/<edited>"
}

Also regarding API, if you run the dataset's GET call you get back the same JSON. https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/get#try-it

Upvotes: 5

Related Questions