akauppi

Reputation: 18066

How to consume changes to Google Cloud Datastore as a stream?

The Cloud Dataflow page implies that this is possible, but I haven't found a way to observe change events in the Google Cloud Datastore docs. How is it done?

(Image from the Cloud Dataflow page; not reproduced here.)

Upvotes: 4

Views: 1242

Answers (1)

dsesto

Reputation: 8178

As far as I am aware, Cloud Datastore integrates with Dataflow through DatastoreIO (now based on DatastoreV1), which can only be used as a bounded source for batch jobs.

I looked for an alternative that would let you use Datastore, directly or indirectly, as an unbounded source, for instance a Pub/Sub topic to which Datastore changes are published and from which Dataflow consumes them. I do not think that is viable, though, because, as you said, there is no easy way to detect changes (added entities, modified entities, etc.) in Datastore.
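To make the difficulty concrete: since Datastore itself does not expose change events, the only place such events could come from is the code that performs the writes. Below is a minimal sketch of that idea, not a recommended solution. Everything in it is hypothetical: `PublishingStore`, `ChangeEvent` and `drain` are names made up for illustration, a plain dict stands in for Datastore, and a `queue.Queue` stands in for a Pub/Sub topic. No real client library is used.

```python
from dataclasses import dataclass, field
from queue import Queue
from typing import Any, Dict, List, Optional


@dataclass
class ChangeEvent:
    """One change record. The shape is an assumption made for this sketch."""
    kind: str                          # "upsert" or "delete"
    key: str
    entity: Optional[Dict[str, Any]]   # None for deletes


@dataclass
class PublishingStore:
    """Hypothetical write wrapper.

    The dict stands in for Datastore and the Queue stands in for a
    Pub/Sub topic; neither is a real client.
    """
    topic: Queue = field(default_factory=Queue)
    _entities: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def put(self, key: str, entity: Dict[str, Any]) -> None:
        # Write first, then publish the change to the stand-in topic.
        self._entities[key] = dict(entity)
        self.topic.put(ChangeEvent("upsert", key, dict(entity)))

    def delete(self, key: str) -> None:
        # Only publish if something was actually removed.
        if self._entities.pop(key, None) is not None:
            self.topic.put(ChangeEvent("delete", key, None))


def drain(topic: Queue) -> List[ChangeEvent]:
    """Collect every event currently waiting on the stand-in topic."""
    events = []
    while not topic.empty():
        events.append(topic.get())
    return events


store = PublishingStore()
store.put("Task/1", {"done": False})
store.put("Task/1", {"done": True})
store.delete("Task/1")
events = drain(store.topic)
print([e.kind for e in events])
```

Note the limits, which are the reason I would not call this viable in general: any write that bypasses the wrapper (another service, a manual edit) produces no event, and in a real system the write and the publish would be two separate, non-atomic operations.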

For now, I have filed an internal request to improve the documentation to either modify the image so that it does not imply that Cloud Datastore can be used with a Streaming Pipeline, or clarify this use case.

Upvotes: 2
