Reputation: 2929
We are aware that when using bq mk
command to create a dataset in BigQuery, we can use flag --data_location
to specify which region we want table data under this dataset to be located in.
We are now wanting to set up a monitor so that whenever someone creates a dataset outside of our designated location, we can trig an alert to the dataset owner. In order to do this, we'll need a script that can automatically scan through all the datasets and get the location information. we looked at both api calls and bq command line tool commands, there's no clue with regarding to showing/inquirying data location of a dataset. Wondering if there's a way to accomplish our goal?
Upvotes: 8
Views: 5515
Reputation: 5684
Add jq to extract location
from the output.
bq --format="json" --project_id=<project> show <datasetname> | jq -r '.location'
returns just the location.
australia-southeast2
If you wanted the project that owns the dataset
project=$(bq --format="json" show <datasetname> | jq -r '.datasetReference.projectId')
Upvotes: 3
Reputation: 172974
we looked at both api calls and bq command line tool commands, there's no clue with regarding to showing/inquirying data location of a dataset. Wondering if there's a way to accomplish our goal?
You can use API to accomplish this:
With Datasets: list
API you can lists all datasets in the specified project
And then, with Datasets: get
API you can returns the dataset specified by datasetID and check dataset's location
property
Upvotes: 1
Reputation: 207838
To get all your datasets in the current project:
bq ls -d --format=json
If you run
bq show --format=json <dataset_name>
you get back a JSON that contains the location
key:
{
"kind":"bigquery#dataset",
"datasetReference":{
"projectId":"<edited>",
"datasetId":"wr_temp"
},
"creationTime":"1479393712602",
"access":[
{
"specialGroup":"projectWriters",
"role":"WRITER"
},
{
"specialGroup":"projectOwners",
"role":"OWNER"
},
{
"role":"OWNER",
"userByEmail":"<edited>"
},
{
"specialGroup":"projectReaders",
"role":"READER"
}
],
"defaultTableExpirationMs":"604800000",
"etag":"<edited>",
"location":"US",
"lastModifiedTime":"1479393712602",
"id":"<edited>",
"selfLink":"https://www.googleapis.com/bigquery/v2/projects/<edited>"
}
Also regarding API, if you run the dataset's GET call you get back the same JSON. https://cloud.google.com/bigquery/docs/reference/rest/v2/datasets/get#try-it
Upvotes: 5