Reading CSV header and saving with Dataflow Beam on Python

How do I read the first line of a CSV file and store the header data in Apache Beam (Python)?

Upvotes: 1

Views: 1035

Answers (1)

ningk

Reputation: 1383

Check out this example. See how the UsCovidDataCsvReader parses the input.

The basic ideas are:

  • read the header and build a schema from it
  • read the file with skip_header_lines=1
  • parse the input with the schema to build a PCollection
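A minimal sketch of those three steps, assuming a local, comma-separated file with no line breaks inside quoted fields. The names read_header, parse_line and run are made up for illustration and are not taken from the UsCovidDataCsvReader example. For a gs:// path, the built-in open() would not work; something like apache_beam.io.filesystems.FileSystems.open could be used to read the first line instead.

```python
import csv
import io


def read_header(path):
    """Return the column names from the first line of a local CSV file."""
    with open(path, newline="") as f:
        return next(csv.reader(f))


def parse_line(line, header):
    """Turn one CSV data line into a dict keyed by the header names."""
    values = next(csv.reader(io.StringIO(line)))
    return dict(zip(header, values))


def run(path):
    # Imported here so the helpers above can be used without Beam installed.
    import apache_beam as beam

    # Step 1: read the header up front and use it as the schema.
    header = read_header(path)

    with beam.Pipeline() as pipeline:
        (
            pipeline
            # Step 2: skip the header line when reading the data.
            | "Read" >> beam.io.ReadFromText(path, skip_header_lines=1)
            # Step 3: parse each line against the header.
            | "Parse" >> beam.Map(parse_line, header)
            | "Print" >> beam.Map(print)
        )
```

Calling run("data.csv") (a hypothetical file name) would print one dict per data row. The header is read before the pipeline is built, so it is passed to beam.Map as a plain extra argument rather than as a side input.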

Upvotes: 2
