anupam

Reputation: 357

GCP Dataflow vs Cloud Functions for small files and infrequent updates

I have a CSV file that is used to update entries in a SQL database. The file size is at most 50 KB and the update frequency is twice a week. I also have a requirement to do some automated sanity testing.

What should I use: Dataflow or Cloud Functions?

Upvotes: 3

Views: 737

Answers (2)

Israel Herraiz

Reputation: 656

If there are no aggregations, and each input "element" does not interact with the others, that is, the pipeline works 1:1, then you can use either Cloud Functions or Dataflow.

But if you need to do aggregations, filtering, or any complex calculation that involves more than a single element in isolation, you will not be able to implement that with Cloud Functions. You need to use Dataflow in that situation, as in the sketch below.
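For example, summing all the values that share an id requires seeing more than one row at a time, which is exactly what a Cloud Function invocation cannot express naturally but an Apache Beam (Dataflow) pipeline can. A minimal sketch, where the bucket path and the two-column CSV layout are made-up illustrations:

    # Hypothetical aggregation: sum the "amount" column per id across rows.
    import apache_beam as beam

    def parse_row(line):
        # Assumed CSV layout: id,amount
        record_id, amount = line.split(",")
        return record_id, float(amount)

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "Read CSV" >> beam.io.ReadFromText(
                "gs://my-bucket/updates.csv", skip_header_lines=1)
            | "Parse" >> beam.Map(parse_row)
            | "Sum per id" >> beam.CombinePerKey(sum)  # cross-element, not 1:1
            | "Format" >> beam.MapTuple(lambda k, v: f"{k},{v}")
            | "Write" >> beam.io.WriteToText("gs://my-bucket/aggregated")
        )

This runs locally on the DirectRunner by default; passing the DataflowRunner in the pipeline options runs it on Dataflow instead.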

Upvotes: 1

Daniel Amaral

Reputation: 126

For this use case (two small files per week), a Cloud Function will be the best option. Other options would be to create a Dataflow batch pipeline triggered by a Cloud Function, or to create a Dataflow streaming pipeline, but both options will be more expensive.

Here is some documentation about how to connect to Cloud SQL from a Cloud Function.

And here is some documentation related to triggering a Cloud Function from Cloud Storage. A rough sketch combining the two is below.
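Putting those two pieces together, here is a minimal sketch, assuming a first-generation Python Cloud Function, a MySQL Cloud SQL instance, and made-up table/column names (entries, id, value):

    # Triggered by google.storage.object.finalize on the upload bucket.
    import csv
    import io
    import os

    import pymysql
    from google.cloud import storage

    def update_from_csv(event, context):
        """Background function; `event` describes the uploaded GCS object."""
        bucket_name = event["bucket"]
        blob_name = event["name"]

        # Download the CSV (max ~50 KB, so reading it into memory is fine).
        client = storage.Client()
        data = client.bucket(bucket_name).blob(blob_name).download_as_text()

        # Connect through the Cloud SQL unix socket exposed to Cloud Functions.
        conn = pymysql.connect(
            unix_socket=f"/cloudsql/{os.environ['INSTANCE_CONNECTION_NAME']}",
            user=os.environ["DB_USER"],
            password=os.environ["DB_PASS"],
            db=os.environ["DB_NAME"],
        )
        try:
            with conn.cursor() as cur:
                for row in csv.DictReader(io.StringIO(data)):
                    # Assumed CSV columns: id, value
                    cur.execute(
                        "UPDATE entries SET value = %s WHERE id = %s",
                        (row["value"], row["id"]),
                    )
            conn.commit()
        finally:
            conn.close()

You would deploy it with --trigger-resource pointing at the bucket and --trigger-event google.storage.object.finalize, with google-cloud-storage and PyMySQL listed in requirements.txt.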

See ya.

Upvotes: 1
