Sahal Nurain

Reputation: 11

Apache Beam method 'ReadFromBigQuery' producing TypeError: isinstance() arg 2 must be a type or tuple of types

I'm trying to read a CSV file from Google Cloud Storage (GCS), but the read step raises a TypeError. Below is the code snippet I am working with:

import apache_beam as beam
from apache_beam.dataframe.convert import to_dataframe


def print_row(row):
    print(row)


pipeline = beam.Pipeline()


test_pipeline = (pipeline 
                  | "read_from_bigquery" >> beam.io.ReadFromBigQuery( table = 'gs://cloud-samples-data/bigquery/sample-transactions/transactions.csv')
                  | "print rows" >> beam.Map(lambda r: (print_row(r))))

pipeline.run()


Upvotes: 0

Views: 410

Answers (2)

robertwb

Reputation: 5104

For CSV files you could also consider using the DataFrame API, e.g.:

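import apache_beam as beam
from apache_beam.dataframe.convert import to_pcollection
from apache_beam.dataframe.io import read_csv


with beam.Pipeline() as p:
  # read_csv already yields a deferred DataFrame, so to_dataframe is not needed here.
  df = p | read_csv('gs://cloud-samples-data/bigquery/sample-transactions/transactions.csv')
  # Convert back to a PCollection if the rest of the pipeline uses regular Beam transforms.
  pcoll = to_pcollection(df)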

Upvotes: 0

Sahal Nurain

Reputation: 11

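Solved. ReadFromBigQuery expects a BigQuery table reference, not a gs:// path to a file, so the CSV has to be read with ReadFromText instead. Here is the required change: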

 "read_from_bigquery" >> beam.io.ReadFromText('gs://cloud-samples-data/bigquery/sample-transactions/transactions.csv')
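
Note that ReadFromText emits each line of the file as a plain string, so the rows still need to be split into fields. A minimal sketch of one way to do that follows. The helper names parse_csv_line and run are hypothetical (not part of Beam), and skip_header_lines=1 assumes the first line of the file is a header row, which I have not verified for this file.

```python
import csv


def parse_csv_line(line):
    """Split one CSV-formatted line into a list of string fields."""
    # csv.reader handles quoted fields and embedded commas, unlike line.split(",").
    return next(csv.reader([line]))


def run(path):
    """Read a CSV file line by line, parse each line, and print the fields."""
    # Imported here so parse_csv_line can be used on its own without Beam installed.
    import apache_beam as beam

    with beam.Pipeline() as pipeline:
        (
            pipeline
            # Assumption: the first line is a header row and should be skipped.
            | "read_csv_lines" >> beam.io.ReadFromText(path, skip_header_lines=1)
            | "parse_fields" >> beam.Map(parse_csv_line)
            | "print_rows" >> beam.Map(print)
        )
```

Calling run('gs://cloud-samples-data/bigquery/sample-transactions/transactions.csv') should then print one list of string fields per row. The values stay as strings, so any numeric columns would still need converting.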

Upvotes: 1
