Reputation: 9601
I have a PCollection of matched GCS filenames, each of which points to a file containing a single gzip-compressed JSON blob. What's the best way to read the entire file, decompress it, and JSON-decode it?
Are there any existing APIs and/or examples that can give me a head start? Seems like this would be a pretty common use case.
Upvotes: 2
Views: 1112
Reputation: 3214
Reading a whole file as a single JSON blob isn't natively supported in Dataflow. To accomplish it, you could implement a custom source by subclassing FileBasedSource:
https://cloud.google.com/dataflow/java-sdk/JavaDoc/com/google/cloud/dataflow/sdk/io/FileBasedSource
If that's enough to get started, we can continue to update this answer with more information.
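As a starting point, here is a minimal sketch of the per-file step only: read the whole stream, gunzip it, and return the text. It uses nothing but the JDK and is not part of the Dataflow SDK; the class name `GzipBlobReader` and both method names are hypothetical. It assumes the blob is UTF-8 and small enough to hold in memory. JSON decoding is left to whichever library you already use (Jackson, Gson, etc.).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper (not a Dataflow API): whole-file gzip decoding. */
final class GzipBlobReader {

    private GzipBlobReader() {}

    /**
     * Reads the stream to the end, decompresses it as gzip, and returns
     * the result as text. Assumes UTF-8 content that fits in memory.
     */
    static String readGzippedText(InputStream compressed) throws IOException {
        try (GZIPInputStream gzip = new GZIPInputStream(compressed);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    /** Gzips a string; only here so the sketch can be tried locally. */
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }
}
```

To try it locally, pass `new ByteArrayInputStream(GzipBlobReader.gzip("{\"id\": 1}"))` to `readGzippedText` and hand the returned string to your JSON parser. If your reader gives you a `ReadableByteChannel` rather than an `InputStream`, the JDK's `java.nio.channels.Channels.newInputStream` can wrap it.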
Upvotes: 2