mircomp

Reputation: 11

Google Dataflow Pricing Streaming Mode

I'm new to Dataflow. I'd like to use the Dataflow streaming template "Pub/Sub Subscription to BigQuery" to transfer some messages, say 10,000 per day. My question is about pricing, since I don't understand how it is computed for streaming mode, with or without Streaming Engine enabled. I've used the Google Cloud pricing calculator, which asks for the following:
machine type, number of worker nodes used by the job, whether the job is streaming or batch, number of GB of Persistent Disk (PD), and hours the job runs per month.

Consider the easiest case, since I don't need many resources, i.e.

Case 1: Streaming Engine DISABLED

So I will pay:

Case 2: Streaming Engine ENABLED

So I will pay:

Assuming messages of 1024 bytes, the monthly traffic is 1024 x 10000 x 30 bytes = 0.307 GB, for an extra cost of 0.307 GB x $0.018 = about $0.0055 (almost zero).
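As a quick sanity check of that arithmetic, here is a minimal Python sketch. It assumes decimal gigabytes (10^9 bytes), a 30-day month, and the $0.018 per GB figure quoted above; the function names are illustrative only, and current prices should be checked against the pricing page.

```python
# Inputs taken from the question; the per-GB price is the figure quoted
# above and may not match current pricing.
MESSAGE_SIZE_BYTES = 1024
MESSAGES_PER_DAY = 10_000
DAYS_PER_MONTH = 30
PRICE_PER_GB_USD = 0.018


def monthly_volume_gb(message_size_bytes: int, messages_per_day: int, days: int) -> float:
    """Return the monthly data volume in decimal gigabytes (1 GB = 1e9 bytes)."""
    return message_size_bytes * messages_per_day * days / 1e9


def monthly_data_cost_usd(volume_gb: float, price_per_gb: float) -> float:
    """Return the monthly cost of processing volume_gb at price_per_gb."""
    return volume_gb * price_per_gb


volume = monthly_volume_gb(MESSAGE_SIZE_BYTES, MESSAGES_PER_DAY, DAYS_PER_MONTH)
cost = monthly_data_cost_usd(volume, PRICE_PER_GB_USD)
print(f"{volume:.3f} GB -> ${cost:.4f} per month")
```

Changing the constants at the top shows how quickly (or slowly) the data-processing charge grows with message size and rate.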

So, with this kind of traffic, I will save about $15 by using Streaming Engine. Am I correct? Is there something else to consider, or is something wrong with my assumptions and calculations? Additionally, considering the low amount of data, is Dataflow really a good fit for this kind of use? Or should I approach this problem in a different way?

Thank you in advance!

Upvotes: 1

Views: 3580

Answers (1)

guillaume blaquiere

Reputation: 75735

That's not wrong, but it's not perfectly accurate.

In streaming mode, your Dataflow job is always listening to the Pub/Sub subscription, so it needs to be up full time.

In batch processing, you normally start the job, it does its work, and then it stops.

In your comparison, you assume a batch job that runs full time. That's not impossible, but I don't think it fits your use case.


As for streaming versus batch, it all depends on how much you need real-time data.

  • If you want to ingest the data into BigQuery with low latency (within a few seconds) to have real-time data, streaming is the right choice.
  • If having the data updated only every hour or every day is enough, batch is a more suitable solution.

One last remark: if your task is only to get messages from Pub/Sub and stream them into BigQuery, you can consider coding it yourself on Cloud Run or Cloud Functions. With only 10k messages per day, it will be free!
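To give an idea of how small that code can be, here is a minimal sketch of the core logic for a push-subscription handler. It assumes the messages are JSON and arrive in the standard Pub/Sub push envelope (base64-encoded `message.data`). The names `decode_push_envelope`, `handle_push`, and `insert_rows` are hypothetical; in a real service `insert_rows` could wrap something like `bigquery.Client().insert_rows_json(table_id, rows)`, which returns a list of errors, and `handle_push` would be called from whatever HTTP framework you use.

```python
import base64
import json
from typing import Callable, Sequence


def decode_push_envelope(envelope: dict) -> dict:
    """Extract the JSON payload from a Pub/Sub push request body."""
    message = envelope["message"]
    payload = base64.b64decode(message["data"]).decode("utf-8")
    return json.loads(payload)


def handle_push(envelope: dict,
                insert_rows: Callable[[Sequence[dict]], Sequence]) -> int:
    """Decode one pushed message, write it as a row, return an HTTP status.

    insert_rows is injected (hypothetical helper): it receives a list of
    rows and returns a list of errors, empty on success.
    """
    try:
        row = decode_push_envelope(envelope)
    except (KeyError, ValueError):
        # Malformed message: acknowledge it with a 2xx status so that
        # Pub/Sub does not keep redelivering something we cannot parse.
        return 204
    errors = insert_rows([row])
    # A non-2xx status asks Pub/Sub to redeliver the message later.
    return 500 if errors else 204
```

Injecting `insert_rows` keeps the decoding logic testable without any cloud credentials. Dropping unparseable messages is a design choice made here for brevity; a dead-letter topic may be a better fit if those messages matter.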

Upvotes: 0
