Reputation: 766
I am using Pub/Sub to capture real-time data and then using GCP Dataflow to stream that data into BigQuery. The Dataflow pipeline is written in Java.
I want to try out the templates provided by Dataflow. The process is: PubSub --> DataFlow --> BigQuery
Currently I am publishing messages to Pub/Sub as plain strings (using Python here), but the Dataflow template only accepts JSON messages, and I haven't found a way to publish a JSON message with the Python library. Can anyone suggest a way to publish a JSON message to Pub/Sub so that I can use the Dataflow template to do the job?
Upvotes: 0
Views: 1038
Reputation: 444
The pipeline provided by Google for pumping data from Pub/Sub to BigQuery now assumes JSON-formatted messages and a matching table schema on the BigQuery side.
Publishing JSON to Pub/Sub is no different from publishing strings: serialize a Python dict to a JSON string and publish that. You can use the following snippet for the dict-to-JSON conversion:
import json
py_dict = {"name": "Peter", "locale": "en-US"}
json_string = json.dumps(py_dict)
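Once you have the JSON string, publishing it works the same way as publishing any other string with the google-cloud-pubsub client. A minimal sketch, assuming a project and topic named "my-project" and "my-topic" (hypothetical names) and that the client library is installed; Pub/Sub payloads must be bytes, so the string is UTF-8 encoded before publishing:

import json
from google.cloud import pubsub_v1

# Serialize the payload to a JSON string (same as above).
payload = json.dumps({"name": "Peter", "locale": "en-US"})

# Publish the JSON string as the message body; Pub/Sub expects bytes.
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "my-topic")  # hypothetical project/topic
future = publisher.publish(topic_path, data=payload.encode("utf-8"))
future.result()  # blocks until the publish succeeds

For the template to write rows directly, the keys in the JSON object should match the column names of the destination BigQuery table.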
If you'd like to customize the pipeline heavily, you can also take the source code at the following location and build your own.
Upvotes: 2