Reputation: 1709
I'm new to Terraform. Is it possible to load the content of a CSV file into a BigQuery table without uploading it to GCS?
I've studied the documentation below, but the approach doesn't seem to work with local files: https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/bigquery_job
Question: Is it somehow possible to do this without uploading the file into Google's environment?
resource "google_bigquery_table" "my_tyable" {
dataset_id = google_bigquery_dataset.bq_config_dataset.dataset_id
table_id = "my_tyable"
schema = file("${path.cwd}/path/to/schema.json")
}
resource "google_bigquery_job" "load_data" {
job_id = "load_data"
load {
source_uris = [
#"gs://cloud-samples-data/bigquery/us-states/us-states-by-date.csv", # this would work
"${path.cwd}/path/to/data.csv", # this is not working
]
destination_table {
project_id = google_bigquery_table.my_tyable.project
dataset_id = google_bigquery_table.my_tyable.dataset_id
table_id = google_bigquery_table.my_tyable.table_id
}
skip_leading_rows = 0
schema_update_options = ["ALLOW_FIELD_RELAXATION", "ALLOW_FIELD_ADDITION"]
write_disposition = "WRITE_APPEND"
autodetect = true
}
}
Upvotes: 0
Views: 2447
Reputation: 313
As far as I know there is no direct option to do this; however, it can be done by first uploading the CSV file to an external source (like a GCS bucket) and then referencing that source in the block where you define the BigQuery table.
Example:
resource "google_storage_bucket_object" "orders" {
name = "order_items.csv"
content_type = "csv"
source = "data/order_items.csv"
bucket = google_storage_bucket.default.id
}
resource "google_bigquery_table" "orders" {
dataset_id = google_bigquery_dataset.bq_dataset.dataset_id
table_id = "orders"
deletion_protection = false
labels = {
env = "default"
}
external_data_configuration {
autodetect = true
source_uris =["gs://${google_storage_bucket.default.name}/${google_storage_bucket_object.orders.name}"]
source_format = "CSV"
}
depends_on = [google_storage_bucket_object.orders]
}
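The example above assumes a google_storage_bucket.default resource already exists. A minimal sketch of such a bucket definition, where the name and location are placeholders rather than values from this answer, could look like this:
# Sketch only: "my-orders-bucket" and "US" are placeholders.
# Bucket names must be globally unique.
resource "google_storage_bucket" "default" {
  name                        = "my-orders-bucket"
  location                    = "US"
  uniform_bucket_level_access = true
}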
Upvotes: 1
Reputation: 378
I tried this in my own project and, based on the error message I'm seeing, I don't think it is possible:
│ Error: Error creating Job: googleapi: Error 400: Source URI must be a Google Cloud Storage location: [REDACTED].csv, invalid
│
│ with module.[REDACTED].google_bigquery_job.load_data,
│ on modules\[REDACTED]\main.tf line 73, in resource "google_bigquery_job" "load_data":
│ 73: resource "google_bigquery_job" "load_data" {
│
I ended up putting the CSV file into the same bucket as the Terraform state, under a data/ prefix, and loading it from there.
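A minimal sketch of that workaround, assuming a placeholder bucket name my-tf-state-bucket and reusing the table resource from the question:
# Sketch only: "my-tf-state-bucket" stands in for the bucket that also holds the Terraform state.
resource "google_storage_bucket_object" "data_csv" {
  name   = "data/data.csv"            # the data/ prefix mentioned above
  source = "${path.module}/data.csv"  # local CSV, uploaded to GCS at apply time
  bucket = "my-tf-state-bucket"
}

resource "google_bigquery_job" "load_data" {
  job_id = "load_data"

  load {
    # Load jobs require a gs:// URI, so the uploaded object is referenced here.
    source_uris = ["gs://my-tf-state-bucket/${google_storage_bucket_object.data_csv.name}"]

    destination_table {
      project_id = google_bigquery_table.my_tyable.project
      dataset_id = google_bigquery_table.my_tyable.dataset_id
      table_id   = google_bigquery_table.my_tyable.table_id
    }

    autodetect        = true
    write_disposition = "WRITE_APPEND"
  }
}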
Upvotes: 2
Reputation: 519
Probably the best option is to load it using the file function:
file("${path.module}/data.csv")
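One concrete place file() can be used here is the content argument of google_storage_bucket_object, which inlines the local CSV as a string; this is only a sketch with a placeholder bucket name, and the data still ends up in a GCS object before BigQuery can load it:
# Sketch only: "my-data-bucket" is a placeholder; file() reads the local CSV
# into the Terraform configuration and writes it to a GCS object.
resource "google_storage_bucket_object" "data_csv_inline" {
  name    = "data.csv"
  bucket  = "my-data-bucket"
  content = file("${path.module}/data.csv")
}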
Upvotes: -1