Reputation: 11
I'm trying to read a CSV file from Google Cloud Storage (GCS), but the read step raises a TypeError. Below is the code snippet I am working with:
import apache_beam as beam
from apache_beam.dataframe.convert import to_dataframe
def print_row(row):
    print(row)
pipeline = beam.Pipeline()
test_pipeline = (pipeline
    | "read_from_bigquery" >> beam.io.ReadFromBigQuery(table='gs://cloud-samples-data/bigquery/sample-transactions/transactions.csv')
    | "print rows" >> beam.Map(lambda r: print_row(r)))
pipeline.run()
Upvotes: 0
Views: 410
Reputation: 5104
For CSV files, you could also consider using the Beam DataFrame API, e.g.:
import apache_beam as beam
from apache_beam.dataframe.convert import to_pcollection
from apache_beam.dataframe.io import read_csv
with beam.Pipeline() as p:
    # read_csv yields a deferred DataFrame; to_pcollection turns it into a PCollection
    df = p | read_csv('gs://cloud-samples-data/bigquery/sample-transactions/transactions.csv')
    pcoll = to_pcollection(df)
Upvotes: 0
Reputation: 11
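Solved. ReadFromBigQuery expects a BigQuery table reference, not a gs:// file path, so the CSV has to be read with ReadFromText instead. Here is the required change: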
"read_from_bigquery" >> beam.io.ReadFromText('gs://cloud-samples-data/bigquery/sample-transactions/transactions.csv')
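Note that ReadFromText emits each line as a plain string, so the fields still have to be split. Below is a minimal sketch of one way to do that, not necessarily what the original pipeline needs. The helper parse_csv_line, the function run, and the step labels are hypothetical names; skip_header_lines=1 assumes the file starts with a header row, and reading a gs:// path assumes the GCP extras (apache-beam[gcp]) are installed.

```python
import csv


def parse_csv_line(line):
    """Split one CSV line into a list of string fields, honoring quoted commas."""
    return next(csv.reader([line]))


def run(path="gs://cloud-samples-data/bigquery/sample-transactions/transactions.csv"):
    # Imported here so parse_csv_line can be used and tested without Beam installed.
    import apache_beam as beam

    with beam.Pipeline() as pipeline:
        (
            pipeline
            # Assumption: the first line of the file is a header row.
            | "read_csv_lines" >> beam.io.ReadFromText(path, skip_header_lines=1)
            | "parse_fields" >> beam.Map(parse_csv_line)
            | "print_rows" >> beam.Map(print)
        )
```

Calling run() builds and executes the pipeline; every element reaching the last step is a list of strings, one per column, so any numeric conversion would be an extra Map step.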
Upvotes: 1