Forepick

Reputation: 937

Cloud Dataflow - TextIO.Read: Return specific file URL given a match pattern

Given a match pattern passed to TextIO.Read (for instance gs://my_bucket/file-*.txt), I want to get the full URL of every matched file. How can I retrieve these URLs?

Thanks

Upvotes: 0

Views: 194

Answers (1)

Lara Schmidt

Reputation: 309

Dataflow doesn't currently support anything like this.

You can use the GCS utilities to get a list of the files that match a given wildcard pattern (one containing a *).

Here is their command-line tool: https://cloud.google.com/storage/docs/gsutil

And some client libraries: https://cloud.google.com/storage/docs/json_api/v1/libraries#api-client-libraries

However, note that if the files were written recently or change very often, GCS only guarantees eventual consistency on list operations, so you might get a slightly different list each time. If the file list isn't changing, it should be correct.
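
As a rough illustration of the client-library route, here is a minimal Python sketch that expands a pattern like gs://my_bucket/file-*.txt into full URLs. The helper names (split_gcs_pattern, wildcard_to_regex, match_urls, list_matching_urls) are hypothetical and not part of Dataflow or any GCS library. The matching is a simplified stand-in for gsutil's wildcard rules: it only handles * and ?, and it assumes * should not cross a "/". The last function assumes the google-cloud-storage package is installed and default credentials are configured.

```python
import re


def split_gcs_pattern(pattern):
    """Split 'gs://bucket/path/file-*.txt' into (bucket, object_pattern)."""
    if not pattern.startswith("gs://"):
        raise ValueError("expected a gs:// URL, got %r" % pattern)
    bucket, _, object_pattern = pattern[len("gs://"):].partition("/")
    return bucket, object_pattern


def wildcard_to_regex(object_pattern):
    """Translate '*' and '?' into a regex (assumption: '*' stops at '/')."""
    parts = []
    for ch in object_pattern:
        if ch == "*":
            parts.append("[^/]*")
        elif ch == "?":
            parts.append("[^/]")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts) + r"\Z")


def match_urls(pattern, object_names):
    """Return the sorted gs:// URLs of the object names matching the pattern."""
    bucket, object_pattern = split_gcs_pattern(pattern)
    regex = wildcard_to_regex(object_pattern)
    return sorted(
        "gs://%s/%s" % (bucket, name)
        for name in object_names
        if regex.match(name)
    )


def list_matching_urls(pattern):
    """List the bucket and filter by pattern.

    Assumes google-cloud-storage is installed and credentials are set up;
    the import is done lazily so the helpers above work without it.
    """
    from google.cloud import storage

    bucket, object_pattern = split_gcs_pattern(pattern)
    # Narrow the listing to the literal part before the first wildcard.
    prefix = re.split(r"[*?]", object_pattern, maxsplit=1)[0]
    blobs = storage.Client().list_blobs(bucket, prefix=prefix)
    return match_urls(pattern, (blob.name for blob in blobs))
```

Calling list_matching_urls("gs://my_bucket/file-*.txt") before building the pipeline would give you the URLs up front. The same caveat about list consistency applies to the result.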

Upvotes: 4
