Reputation: 33
I am using Google Cloud to do some testing. I followed this guide to run a test against BigQuery: https://cloud.google.com/solutions/using-cloud-dataflow-for-batch-predictions-with-tensorflow
When I run the script:
python prediction/run.py \
--runner DataflowRunner \
--project $PROJECT \
--staging_location $BUCKET/staging \
--temp_location $BUCKET/temp \
--job_name $PROJECT-prediction-bq \
--setup_file prediction/setup.py \
--model $BUCKET/model \
--source bq \
--input $PROJECT:mnist.images \
--output $PROJECT:mnist.predict
it shows this error:
Traceback (most recent call last):
  File "prediction/run.py", line 23, in <module>
    predict.run()
  File "/home/ahuoo_com/dataflow-prediction-example/prediction/modules/predict.py", line 98, in run
    images = p | 'ReadFromBQ' >> beam.Read(beam.io.BigQuerySource(known_args.input))
**AttributeError: 'module' object has no attribute 'Read'**
It looks like the apache_beam package doesn't contain the attribute 'Read'. I think the example Google provides on GitHub may be wrong; take a look at line 98 of the code.
Has anyone else run a test using this guide?
Upvotes: 0
Views: 2428
Reputation: 11797
You are right, there is a small mistake in the code. On line 98, where it says:
images = p | 'ReadFromBQ' >> beam.Read(beam.io.BigQuerySource(known_args.input))
It should be:
images = p | 'ReadFromBQ' >> beam.io.Read(beam.io.BigQuerySource(known_args.input))
Also, on line 100, where it says:
predictions | 'WriteToBQ' >> beam.Write(beam.io.BigQuerySink(...))
it should be:
predictions | 'WriteToBQ' >> beam.io.Write(beam.io.BigQuerySink(...))
The Read and Write transforms for PCollections come from the beam.io module, not from apache_beam itself.
Upvotes: 2