AnumNuma

Reputation: 61

Passing positional argument to Dataproc serverless PySpark script.py

I ran the command below to pass an argument (--args argument="xyz") to script.py, but the argument never reaches the script. I have tried passing it in several different ways, and each time the script fails with IndexError: list index out of range.

Could someone please help? Thanks in advance.

Command:

gcloud dataproc batches submit pyspark gs://path/script.py \
--project xxx \
--region xxx  \
--batch xxx \
--version 2.1 \
--deps-bucket='xxx' \
--staging-bucket='xxx' \
--service-account xxx  \
--subnet xxx
--args argument="xyz"

Error:

param=sys.argv[1]
               ~~~~~~~~^^^
IndexError: list index out of range

Upvotes: 2

Views: 233

Answers (1)

Arunkumar Chacko

Reputation: 71

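Try something like the following. Replace --args with a bare -- followed by a space; everything after the -- is passed to script.py as positional arguments. Also note the trailing backslash after --subnet xxx, which the command in the question is missing: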

gcloud dataproc batches submit pyspark gs://path/script.py \
  --project xxx \
  --region xxx  \
  --batch xxx \
  --version 2.1 \
  --deps-bucket='xxx' \
  --staging-bucket='xxx' \
  --service-account xxx  \
  --subnet xxx \
  -- argument="xyz"
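
On the script side, the value should then show up as sys.argv[1]. Assuming a POSIX shell, the double quotes are stripped before gcloud sees them, so the script would receive the literal string argument=xyz. Below is a minimal sketch of how script.py might read it defensively; the helper name first_job_arg is hypothetical and not part of any Dataproc or PySpark API:

```python
import sys


def first_job_arg(argv=None):
    """Return the first positional job argument (hypothetical helper).

    Accepts either a bare value ("xyz") or a key=value pair
    ("argument=xyz"), in which case only the value is returned.
    """
    argv = sys.argv if argv is None else argv
    if len(argv) < 2:
        # Fail with a clear message instead of a bare IndexError.
        raise ValueError("expected one argument after '--' in the gcloud command")
    raw = argv[1]
    # partition() leaves sep empty when there is no '=' in the string.
    _key, sep, value = raw.partition("=")
    return value if sep else raw
```

In script.py this would replace param=sys.argv[1] with param = first_job_arg(). If you only need the raw string, sys.argv[1] on its own is enough once the argument is actually passed.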

Upvotes: 3
