Carbonara83

Reputation: 11

beam dataflow python name 'PipelineOptions' is not defined

I want to create a very simple pipeline and am already stuck at the beginning. Here is my code:

import apache_beam as beam
options = PipelineOptions()
google_cloud_options = options.view_as(GoogleCloudOptions)
google_cloud_options.project = 'myproject'
google_cloud_options.job_name = 'mypipe'
google_cloud_options.staging_location = 'gs://mybucket/staging'
google_cloud_options.temp_location = 'gs://mybucket/temp'
options.view_as(StandardOptions).runner = 'DataflowRunner'

This produces the error:

NameError: name 'PipelineOptions' is not defined

Upvotes: 1

Views: 2807

Answers (3)

Geoffroy de Viaris

Reputation: 381

The module has moved from apache_beam.utils to apache_beam.options:

You should now use:

from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import SetupOptions
from apache_beam.options.pipeline_options import GoogleCloudOptions
from apache_beam.options.pipeline_options import StandardOptions

Official documentation: https://beam.apache.org/releases/pydoc/2.0.0/_modules/apache_beam/options/pipeline_options.html

Upvotes: 1

kfpanda

Reputation: 1

from apache_beam.pipeline import PipelineOptions
options = PipelineOptions()

Upvotes: 0

mcseare

Reputation: 3

You will need to add some additional imports for the example to work:

from apache_beam.io import ReadFromText
from apache_beam.io import WriteToText
from apache_beam.metrics import Metrics
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import SetupOptions
from apache_beam.options.pipeline_options import GoogleCloudOptions
from apache_beam.options.pipeline_options import StandardOptions

Upvotes: 0
