Reputation: 257
I've been trying to use custom made images to run my google data flow pipeline. Given the information from https://cloud.google.com/compute/docs/reference/latest/images I've tested the following code snippets:
DataflowPipelineOptions options = PipelineOptionsFactory.create().as(DataflowPipelineOptions.class);
...
options.setDiskSourceImage("ubuntu-1504-vivid-v20150911");
options.setDiskSourceImage("projects/ubuntu-os-cloud/global/images/ubuntu-1504-vivid-v20150911");
options.setDiskSourceImage("https://www.googleapis.com/compute/beta/projects/ubuntu-os-cloud/global/images/ubuntu-1504-vivid-v20150911");
all of the above tries led to the following error in my pipeline:
(b9c7b66a676906f4): Unable to create VMs. Causes: (b9c7b66a67690aef): Error: Message: Invalid value for field 'resource.disks[0].initializeParams.sourceImage': '[edited]'. Must be the URL to a Compute resource of the correct type HTTP Code: 400
Upvotes: 1
Views: 278
Reputation: 6776
Using a custom disk image with Dataflow is not a viable option. The flag diskSourceImage is deprecated and will be removed in a future SDK release. The reason it is no longer supported is because the Dataflow service relies on versioned resources in the VM image. So Dataflow needs control of the VM image so that we can upgrade it as necessary. If users supply their own custom images we have no way of keeping them in sync with the requirements of the Dataflow service.
If your custom VM image is based off a Dataflow image then you would be able to execute jobs using that custom image until the next release of a Dataflow VM image. There is no reasonable way in which you would be able to keep your custom images in sync with Dataflow's VM images so that you would be able to keep this working.
If you would like to customize the VM image please let us know why (e.g. send us an email at [email protected]) so we can either suggest an alternative solution or else consider supporting your use case in the future.
Upvotes: 1
Reputation: 3214
There's a subtle issue with setDiskSourceImage -- it uses 'beta' instead of the current 'v1' version for Compute Engine. If you try the following, it should work:
options.setDiskSourceImage("https://www.googleapis.com/compute/v1/projects/ubuntu-os-cloud/global/images/ubuntu-1504-vivid-v20150911");
Upvotes: 0