avnshrai

Reputation: 73

Migrating a huge Bigtable database in GCP from one account to another using DataFlow

I have a huge database stored in Bigtable in GCP, and I am migrating the Bigtable data from one account to another GCP account using Dataflow. When I ran the job to export sequence files from Bigtable, it created 3000 sequence files in the destination bucket. It is not practical to create a separate Dataflow job for each of the 3000 sequence files, so is there a way to reduce the number of sequence files, or a way to supply all 3000 sequence files at once to a Dataflow job template in GCP?

We have two sequence files and wanted the data uploaded sequentially, one file after another (10 rows and one column), but the result that actually gets uploaded is 5 rows and 2 columns.

Upvotes: 2

Views: 235

Answers (1)

Billy Jacobson

Reputation: 1703

The sequence files should follow a naming pattern, e.g. gs://mybucket/somefolder/output-1, gs://mybucket/somefolder/output-2, gs://mybucket/somefolder/output-3, etc.

When running the Cloud Storage SequenceFile to Bigtable Dataflow template, set the sourcePattern parameter to a wildcard matching that prefix, such as gs://mybucket/somefolder/output-* or gs://mybucket/somefolder/*. A single import job will then read every file matching the pattern, so you don't need one job per sequence file.
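For example, a single import job could be launched from the Google-provided template with gcloud roughly like this. This is a sketch: the project, instance, table, bucket, and region names are placeholders, and you should confirm the template path and parameter names against the Dataflow template documentation for your project.

    # Launch the SequenceFile-to-Bigtable import once, pointing sourcePattern
    # at all 3000 exported files via a wildcard (placeholder names throughout).
    gcloud dataflow jobs run import-bigtable-sequencefiles \
        --gcs-location=gs://dataflow-templates/latest/GCS_SequenceFile_to_Cloud_Bigtable \
        --region=us-central1 \
        --parameters=bigtableProject=my-dest-project,bigtableInstanceId=my-instance,bigtableTableId=my-table,sourcePattern=gs://mybucket/somefolder/output-*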

Upvotes: 1
