elaspog

Reputation: 1709

Load local CSV file into BigQuery table with Terraform?

I'm new to Terraform. Is it possible to load the content of a CSV file into a BigQuery table without uploading it to GCS?

I've studied the documentation below, but its solution doesn't seem to work with local files: https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/bigquery_job

Question: Is it somehow possible to do this without uploading the file into Google's environment?

resource "google_bigquery_table" "my_tyable" {
  dataset_id = google_bigquery_dataset.bq_config_dataset.dataset_id
  table_id   = "my_tyable"
  schema = file("${path.cwd}/path/to/schema.json")
}

resource "google_bigquery_job" "load_data" {
  job_id     = "load_data"

  load {
    source_uris = [
      #"gs://cloud-samples-data/bigquery/us-states/us-states-by-date.csv", # this would work
      "${path.cwd}/path/to/data.csv", # this is not working
    ]

    destination_table {
      project_id = google_bigquery_table.my_tyable.project
      dataset_id = google_bigquery_table.my_tyable.dataset_id
      table_id   = google_bigquery_table.my_tyable.table_id
    }

    skip_leading_rows = 0
    schema_update_options = ["ALLOW_FIELD_RELAXATION", "ALLOW_FIELD_ADDITION"]

    write_disposition = "WRITE_APPEND"
    autodetect = true
  }
}

Upvotes: 0

Views: 2447

Answers (3)

Stephen

Reputation: 313

As far as I know, there is no direct option to do this. However, it can be done by first uploading the CSV file to an external source (such as a GCS bucket) and then referencing that source in the block where you define the BigQuery table.

Example:

resource "google_storage_bucket_object" "orders" {
   name         = "order_items.csv"
   content_type = "csv"
   source       = "data/order_items.csv"
   bucket       = google_storage_bucket.default.id
 }

 resource "google_bigquery_table" "orders" {
   dataset_id          = google_bigquery_dataset.bq_dataset.dataset_id
   table_id            = "orders"
   deletion_protection = false
   labels = {
     env = "default"
   }

   external_data_configuration {
     autodetect = true
     source_uris =["gs://${google_storage_bucket.default.name}/${google_storage_bucket_object.orders.name}"]
     source_format = "CSV"
   }
   depends_on = [google_storage_bucket_object.orders]
 }
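
The snippet above references a bucket and a dataset that are defined elsewhere in the configuration. A minimal sketch of those two resources, keeping the names default and bq_dataset used above (the bucket name and locations are placeholders):

resource "google_storage_bucket" "default" {
  name     = "my-orders-bucket" # placeholder: bucket names must be globally unique
  location = "US"
}

resource "google_bigquery_dataset" "bq_dataset" {
  dataset_id = "bq_dataset"
  location   = "US"
}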

Upvotes: 1

LostHisMind

Reputation: 378

I was trying this in my own project, and based on the error message I am seeing, I don't think it is possible:

│ Error: Error creating Job: googleapi: Error 400: Source URI must be a Google Cloud Storage location: [REDACTED].csv, invalid
│
│   with module.[REDACTED].google_bigquery_job.load_data,
│   on modules\[REDACTED]\main.tf line 73, in resource "google_bigquery_job" "load_data":
│   73: resource "google_bigquery_job" "load_data" {
│

I ended up putting the CSV file into the same bucket as the Terraform state, under the data/ prefix.
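
Roughly, the upload side of that workaround looks like the following sketch (the bucket name is a placeholder for the existing state bucket):

resource "google_storage_bucket_object" "load_csv" {
  name   = "data/data.csv"                # the data/ prefix mentioned above
  source = "${path.cwd}/path/to/data.csv" # local CSV from the question
  bucket = "my-terraform-state-bucket"    # placeholder: the existing state bucket
}

The load job's source_uris entry can then reference "gs://${google_storage_bucket_object.load_csv.bucket}/${google_storage_bucket_object.load_csv.name}" instead of the local path.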

Upvotes: 2

Victor Biga

Reputation: 519

Probably the best option is to load it using the file function:

file("${path.module}/data.csv")

Upvotes: -1
