zkta

Reputation: 105

Terraform - Copy multiple files to a bucket at the same time as bucket creation

Hello,

I have a bit of a headache.

I want to create buckets and copy files into them in bulk at the same time. I have multiple folders (one per dataset name) inside a schema folder, each containing JSON files: schema/dataset1, schema/dataset2, schema/dataset3

The trick is that Terraform generates the bucket name plus a random suffix to avoid names that are already in use. I have one question:

How do I copy files in bulk into a bucket (at the same time as bucket creation)?


resource "google_storage_bucket" "map" {
  for_each                    = {for i, v in var.gcs_buckets: i => v} 
  name                        = "${each.value.id}_${random_id.suffix[0].hex}"
  location                    = var.default_region
  storage_class               = "REGIONAL"
  uniform_bucket_level_access = true 

  # If you destroy your bucket, this option deletes all objects inside the bucket first;
  # otherwise Terraform will fail that run
  force_destroy               = true
  labels = {
    env = var.env_label
  }
}


resource "google_storage_bucket_object" "map" {
  for_each  = {for i, v in var.json_buckets: i => v} 
  name      =  ""
  source    = "schema/${each.value.dataset_name}/*"
  bucket = contains([each.value.bucket_name], each.value.dataset_name) 
  #bucket = "${google_storage_bucket.map[contains([each.value.bucket_name], each.value.dataset_name)]}"     
}


variable "json_buckets" {
  type = list(object({
    bucket_name    = string
    dataset_name   = string
  }))
  default = [
      {
    bucket_name      = "schema_table1",
    dataset_name     = "dataset1",
    },
      {
    bucket_name      = "schema_table2",
    dataset_name     = "dataset2",
    },
    {
    bucket_name      = "schema_table2",
    dataset_name     = "dataset3",
    },
    ]
}


variable "gcs_buckets" {
  type = list(object({
    id       = string
    description = string
  }))
  default = [
      {
    id       = "schema_table1",
    description = "schema_table1",
    },
    ]
}
...

Upvotes: 3

Views: 2926

Answers (1)

Jordan

Reputation: 4502

Why do you have bucket = contains([each.value.bucket_name], each.value.dataset_name)? The contains function returns a bool, and bucket takes a string input (the name of the bucket).
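As a hedged illustration only (the map key "0" is just an example of however your bucket for_each happens to be keyed), bucket needs a plain string, typically the generated name of the bucket resource so the random suffix is included:

# Illustration, not tested: reference the bucket created above so the
# generated "<id>_<random hex>" name is used.
bucket = google_storage_bucket.map["0"].name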

There is no resource that will allow you to copy multiple objects at once to the bucket. If you need to do this in Terraform, you can use the fileset function to get a list of files in your directory, then use that list in your for_each for the google_storage_bucket_object. It might look something like this (untested):

locals {
  // Create a master list that has all files for all buckets
  all_files = merge([
    // Loop through each bucket/dataset combination
    for bucket_idx, bucket_data in var.json_buckets:
    {
      // For each bucket/dataset combination, get a list of all files in that dataset
      for file in fileset("schema/${bucket_data.dataset_name}/", "**"):
        // And stick it in a map of all bucket/file combinations
        "bucket-${bucket_idx}-${file}" => merge(bucket_data, {
          file_name = file
        })
    }
  ]...)
}

resource "google_storage_bucket_object" "map" {
  for_each = local.all_files
  name     = each.value.file_name
  source   = "schema/${each.value.dataset_name}/${each.value.file_name}"
  bucket   = each.value.bucket_name   
}

WARNING: Do not do this if you have a lot of files to upload. This will create a resource in the Terraform state file for each uploaded file, meaning every time you run terraform plan or terraform apply, it will do an API call to check the status of each uploaded file. It will get very slow very quickly if you have hundreds of files to upload.

If you have a ton of files to upload, consider using an external CLI-based tool to sync the local files with the remote bucket after the bucket is created. You can use a module such as this one to run external CLI commands.
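As a rough, untested sketch of that approach (assuming gsutil is installed on the machine running Terraform, and treating the dataset folder and the map key as placeholders), you could run the sync from a null_resource once the bucket exists:

# Hypothetical sketch: sync a local folder into one of the buckets created
# above with gsutil; the objects are copied outside of Terraform state.
resource "null_resource" "sync_schema_files" {
  # Re-run the sync whenever the bucket is (re)created.
  triggers = {
    bucket_name = google_storage_bucket.map["0"].name
  }

  provisioner "local-exec" {
    command = "gsutil -m rsync -r schema/dataset1 gs://${self.triggers.bucket_name}"
  }
}

Because the files are uploaded by an external command, they are not tracked as individual objects in the Terraform state, which avoids the per-file API calls described above.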

Upvotes: 3
