Naga Vijayapuram

Reputation: 931

Dataproc Custom Image: Cannot find dataproc base image with dataproc-version

I am trying to create a Google Dataproc custom image and am running into this error:

$ python generate_custom_image.py --image-name 1.5.48-ubuntu18-custom --dataproc-version 1.5.48-ubuntu18 --customization-script my-customization-script.sh --zone us-central1 --gcs-bucket gs://dataproc-327519-imgs
INFO:__main__:Parsed args: Namespace(accelerator=None, base_image_family=None, base_image_uri=None, customization_script='my-customization-script.sh', dataproc_version='1.5.48-ubuntu18', disk_size=20, dry_run=False, extra_sources={}, family='dataproc-custom-image', gcs_bucket='gs://dataproc-327519-imgs', image_name='1.5.48-ubuntu18-custom', machine_type='n1-standard-1', metadata=None, network='', no_external_ip=False, no_smoke_test=False, oauth=None, project_id=None, service_account='default', shutdown_instance_timer_sec=300, storage_location=None, subnetwork='', zone='us-central1')
INFO:custom_image_utils.args_inferer:Getting Dataproc base image name...
Traceback (most recent call last):
  File "generate_custom_image.py", line 95, in <module>
    main()
  File "generate_custom_image.py", line 86, in main
    args = parse_args(sys.argv[1:])
  File "generate_custom_image.py", line 57, in parse_args
    args_inferer.infer_args(args)
  File "/home/gdataproc/custom-images/custom_image_utils/args_inferer.py", line 225, in infer_args
    _infer_base_image(args)
  File "/home/gdataproc/custom-images/custom_image_utils/args_inferer.py", line 191, in _infer_base_image
    args.dataproc_version)
  File "/home/gdataproc/custom-images/custom_image_utils/args_inferer.py", line 175, in _get_dataproc_image_path_by_version
    "Cannot find dataproc base image with dataproc-version=%s." % version)
RuntimeError: Cannot find dataproc base image with dataproc-version=1.5.48-ubuntu18.

Any idea why?

Upvotes: 2

Views: 626

Answers (1)

Dagang Wei

Reputation: 26528

The custom image script relies on the goog-dataproc-version label on images to resolve an image subminor version (e.g., 1.5.48-ubuntu18) to a specific image URI. Due to an issue in the release process, there may be a delay before the label is added to a newly released image, and until then the script cannot find that image. That is why users sometimes see the error above.

Workarounds:

  1. Pick an older subminor version from the Dataproc 1.5 release versions page: https://cloud.google.com/dataproc/docs/concepts/versioning/dataproc-release-1.5

  2. Or use --dataproc-version <minor-version> (e.g., 1.5-ubuntu18) to let the script automatically resolve to the latest available subminor version. You should be able to see which version it picked by describing your custom image with gcloud compute images describe <custom-image> and checking the dataproc-version label.
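As a rough illustration of workaround 2, here is a small Python sketch that turns a subminor version string into its minor form and rebuilds the command line from the question with it. The helper names (to_minor_version, build_generate_command) are hypothetical and are not part of the custom-images repo, and the version pattern is only an assumption based on the examples in this thread (1.5.48-ubuntu18, 1.5-ubuntu18). The flags are the ones already used in the question.

```python
import re

# Assumed version shape, based only on the examples in this thread:
# <major>.<minor>[.<subminor>][-<os suffix>], e.g. 1.5.48-ubuntu18
_VERSION_RE = re.compile(r"(\d+\.\d+)(?:\.\d+)?(-[A-Za-z0-9]+)?")


def to_minor_version(version: str) -> str:
    """Drop the subminor component: '1.5.48-ubuntu18' -> '1.5-ubuntu18'.

    A string that is already in minor form is returned unchanged.
    Hypothetical helper, not part of the custom-images scripts.
    """
    match = _VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"Unrecognized Dataproc version string: {version!r}")
    major_minor = match.group(1)
    os_suffix = match.group(2) or ""
    return major_minor + os_suffix


def build_generate_command(image_name, version, script, zone, bucket):
    """Build the argument list for generate_custom_image.py.

    Only the flags that appear in the question are used. The command is
    returned as a list and is not executed here.
    """
    return [
        "python", "generate_custom_image.py",
        "--image-name", image_name,
        "--dataproc-version", to_minor_version(version),
        "--customization-script", script,
        "--zone", zone,
        "--gcs-bucket", bucket,
    ]


if __name__ == "__main__":
    cmd = build_generate_command(
        image_name="1.5.48-ubuntu18-custom",
        version="1.5.48-ubuntu18",
        script="my-customization-script.sh",
        zone="us-central1",
        bucket="gs://dataproc-327519-imgs",
    )
    print(" ".join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) if you want to launch the script from Python. Note that with a minor version the image name no longer tells you which subminor version was used, so checking the label on the resulting image as described above is still worthwhile.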

Upvotes: 0
