Reputation: 699
I have list of bigquery tables to be created using terraform but I need only the partition for specific tables.
Here is the ex.
locals {
path = "../../../../../../../../db"
gcp_bq_tables = [
"my_table1",
"my_table1_daily",
"my_table2",
"my_table2_daily"
]
}
And, the terraform script to create the tables:
resource "google_bigquery_table" "gcp_bq_tables" {
for_each = toset(local.gcp_bq_tables)
dataset_id = google_bigquery_dataset.gcp_bq_db.dataset_id
table_id = each.value
schema = file("${local.path}/schema/${each.value}.json")
labels = {
env = var.env
app = var.app
}
}
In that I need to create partition on timestamp, type as DAY but the columns are different. Lets say for my_table1,
How to write the terraform script in this scenario.
I find a way to do it in terraform_documentation but not sure for multiple tables and how can be specified the partition columns for both tables.
Upvotes: 0
Views: 3343
Reputation: 6582
I hope this solution can help.
You can configure a json file to create dynamically your tables with partitions.
tables.json
file
{
"tables": {
"my_table1": {
"dataset_id": "my_dataset",
"table_id": "my_table",
"schema_path": "folder/myschema.json",
"partition_type": "DAY",
"partition_field": "partitionField",
"clustering": [
"field",
"field2"
]
},
"my_table2": {
"dataset_id": "my_dataset",
"table_id": "my_table2",
"schema_path": "folder/myschema2.json",
"partition_type": "DAY",
"partition_field": "partitionField2",
"clustering": [
"field",
"field2"
]
}
}
Then retrieve your tables from Terraform
local file.
locals.tf
file :
locals {
tables = jsondecode(file("${path.module}/resource/tables.json"))["tables"]
}
I put a default partition in variables.tf
file on myDefaultDate
field :
variable "time_partitioning" {
description = "Configures time-based partitioning for this table. cf https://www.terraform.io/docs/providers/google/r/bigquery_table.html#field"
type = map(string)
default = {
type = "DAY"
field = "myDefaultDate"
}
}
In the resource.tf
file, I used a dynamic
bloc :
if the partition exists in the current table from the Json
matadata configuration file tables.json
, I take it.
Otherwise I take the default partition given by the variables.tf
file.
resource.tf file :
resource "google_bigquery_table" "tables" {
for_each = local.tables
project = var.project_id
dataset_id = each.value["dataset_id"]
table_id = each.value["table_id"]
clustering = try(each.value["clustering"], [])
dynamic "time_partitioning" {
for_each = [
var.time_partitioning
]
content {
type = try(each.value["partition_type"], time_partitioning.value["type"])
field = try(each.value["partition_field"], time_partitioning.value["field"])
expiration_ms = try(time_partitioning.value["expiration_ms"], null)
require_partition_filter = try(time_partitioning.value["require_partition_filter"], null)
}
}
schema = file("${path.module}/resource/schema/${each.value["schema_path"]}")
}
Upvotes: 2
Reputation: 18213
In this case it might be the best to use dynamic
[1] with for_each
meta-argument [2] to achieve what you want. The code would have to be changed to:
resource "google_bigquery_table" "gcp_bq_tables" {
for_each = toset(local.gcp_bq_tables)
dataset_id = google_bigquery_dataset.gcp_bq_db.dataset_id
table_id = each.value
schema = file("${local.path}/schema/${each.value}.json")
dynamic "time_partitioning" {
for_each = each.value == "table1" || each.value == "table2" ? [1] : []
content {
type = "DAY"
field = each.value == "table1" ? "my_ts_column_table1" : "my_last_modified_column_table2"
}
}
labels = {
env = var.env
app = var.app
}
}
[1] https://developer.hashicorp.com/terraform/language/expressions/dynamic-blocks
[2] https://developer.hashicorp.com/terraform/language/meta-arguments/for_each
Upvotes: 2