hamdog
hamdog

Reputation: 1181

Python Apache Beam convert list of dicts to multiple PCollection elements

As part of my dataflow I have a CombineFn that returns a list of dicts. I want to print each dict to an avro file as a record. However, when I apply beam.io.WriteToAvro to my CombineFn output, it fails.

It seems like the full list of dicts is being treated as a single element. Is there any way I can get it to treat it like a list of elements?

Upvotes: 1

Views: 2167

Answers (1)

hamdog
hamdog

Reputation: 1181

Hopefully there's a better way to do this, but I was able to break the list up by applying the following DoFn:

class BreakList(beam.DoFn):
    def process(self, element):
        for e in element:
            yield e

Upvotes: 2

Related Questions