Reputation: 762
I started working with Vertex AI and tried to create a custom job.
The requirements.txt file contains:

```
--extra-index-url https://europe-west4-python.pkg.dev/.../europe-west4-python/simple
my_package1==1.2.3
my_package2==4.5.6
```
In the build log I get the following output:
Step #1 - "create job": Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com, https://europe-west4-python.pkg.dev/.../europe-west4-python/simple
Step #1 - "create job": WARNING: Compute Engine Metadata server unavailable on attempt 1 of 3. Reason: timed out
Step #1 - "create job": WARNING: Compute Engine Metadata server unavailable on attempt 2 of 3. Reason: timed out
Step #1 - "create job": WARNING: Compute Engine Metadata server unavailable on attempt 3 of 3. Reason: timed out
Step #1 - "create job": WARNING: Authentication failed using Compute Engine authentication due to unavailable metadata server.
Step #1 - "create job": WARNING: Failed to retrieve Application Default Credentials: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application. For more information, please see https://cloud.google.com/docs/authentication/getting-started
Step #1 - "create job": WARNING: Trying to retrieve credentials from gcloud...
Step #1 - "create job": WARNING: Could not open the configuration file: [/home/.config/gcloud/configurations/config_default].
Step #1 - "create job": ERROR: (gcloud.config.config-helper) You do not currently have an active account selected.
Step #1 - "create job": Please run:
Step #1 - "create job":
Step #1 - "create job": $ gcloud auth login
Step #1 - "create job":
Step #1 - "create job": to obtain new credentials.
Step #1 - "create job":
Step #1 - "create job": If you have already logged in with a different account:
Step #1 - "create job":
Step #1 - "create job": $ gcloud config set account ACCOUNT
Step #1 - "create job":
Step #1 - "create job": to select an already authenticated account to use.
Step #1 - "create job": WARNING: Failed to retrieve credentials from gcloud: gcloud command exited with status: Command '['gcloud', 'config', 'config-helper', '--format=json(credential)']' returned non-zero exit status 1.
Step #1 - "create job": WARNING: Artifact Registry PyPI Keyring: No credentials could be found.
Step #1 - "create job": WARNING: Keyring is skipped due to an exception: Failed to find credentials, Please run: `gcloud auth application-default login or export GOOGLE_APPLICATION_CREDENTIALS=<path/to/service/account/key>`
Step #1 - "create job": User for europe-west4-python.pkg.dev: ERROR: Exception:
Step #1 - "create job": Traceback (most recent call last):
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/cli/base_command.py", line 160, in exc_logging_wrapper
Step #1 - "create job": status = run_func(*args)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/cli/req_command.py", line 247, in wrapper
Step #1 - "create job": return func(self, options, args)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/commands/install.py", line 400, in run
Step #1 - "create job": requirement_set = resolver.resolve(
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/resolution/resolvelib/resolver.py", line 92, in resolve
Step #1 - "create job": result = self._result = resolver.resolve(
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 481, in resolve
Step #1 - "create job": state = resolution.resolve(requirements, max_rounds=max_rounds)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 348, in resolve
Step #1 - "create job": self._add_to_criteria(self.state.criteria, r, parent=None)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/resolvelib/resolvers.py", line 172, in _add_to_criteria
Step #1 - "create job": if not criterion.candidates:
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/resolvelib/structs.py", line 151, in __bool__
Step #1 - "create job": return bool(self._sequence)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 155, in __bool__
Step #1 - "create job": return any(self)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 143, in <genexpr>
Step #1 - "create job": return (c for c in iterator if id(c) not in self._incompatible_ids)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/resolution/resolvelib/found_candidates.py", line 44, in _iter_built
Step #1 - "create job": for version, func in infos:
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/resolution/resolvelib/factory.py", line 279, in iter_index_candidate_infos
Step #1 - "create job": result = self._finder.find_best_candidate(
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/package_finder.py", line 889, in find_best_candidate
Step #1 - "create job": candidates = self.find_all_candidates(project_name)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/package_finder.py", line 830, in find_all_candidates
Step #1 - "create job": page_candidates = list(page_candidates_it)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/sources.py", line 134, in page_candidates
Step #1 - "create job": yield from self._candidates_from_page(self._link)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/package_finder.py", line 790, in process_project_url
Step #1 - "create job": index_response = self._link_collector.fetch_response(project_url)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/collector.py", line 461, in fetch_response
Step #1 - "create job": return _get_index_content(location, session=self.session)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/collector.py", line 364, in _get_index_content
Step #1 - "create job": resp = _get_simple_response(url, session=session)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/index/collector.py", line 135, in _get_simple_response
Step #1 - "create job": resp = session.get(
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/requests/sessions.py", line 600, in get
Step #1 - "create job": return self.request("GET", url, **kwargs)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/network/session.py", line 518, in request
Step #1 - "create job": return super().request(method, url, *args, **kwargs)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/requests/sessions.py", line 587, in request
Step #1 - "create job": resp = self.send(prep, **send_kwargs)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/requests/sessions.py", line 708, in send
Step #1 - "create job": r = dispatch_hook("response", hooks, r, **kwargs)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_vendor/requests/hooks.py", line 30, in dispatch_hook
Step #1 - "create job": _hook_data = hook(hook_data, **kwargs)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/network/auth.py", line 270, in handle_401
Step #1 - "create job": username, password, save = self._prompt_for_password(parsed.netloc)
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/network/auth.py", line 233, in _prompt_for_password
Step #1 - "create job": username = ask_input(f"User for {netloc}: ")
Step #1 - "create job": File "/usr/local/lib/python3.9/dist-packages/pip/_internal/utils/misc.py", line 204, in ask_input
Step #1 - "create job": return input(message)
Step #1 - "create job": EOFError: EOF when reading a line
Step #1 - "create job": The command '/bin/sh -c pip install --no-cache-dir -r ./requirements.txt' returned a non-zero code: 2
Step #1 - "create job": ERROR: (gcloud.ai.custom-jobs.create)
Step #1 - "create job": Docker failed with error code 2.
Step #1 - "create job": Command: docker build --no-cache -t gcr.io/.../cloudai-autogenerated/...:20221212.14.42.28.274055 --rm -f- .
Step #1 - "create job":
The package keyrings.google-artifactregistry-auth is installed. Both [email protected] and the service account specified in the build trigger have rights to read from the Artifact Registry. I tried the same thing locally and hit the same problem from my PC.
My first thought was that the Vertex AI containers have no network connection, but I can at least reach the Google homepage.
However, metadata.google.internal times out.
I tried adding network = "default" and network = "cloudbuild" (I read about both) to the `config.yaml` file used to create the custom job, but I still get the error.
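For context, a minimal sketch of what such a config.yaml might look like (field names taken from the CustomJobSpec API; the project number and image URI are placeholders, and the network field expects a full resource name rather than a short name like "default"):

```yaml
# Sketch: config.yaml for `gcloud ai custom-jobs create --config=config.yaml`.
# Note: network expects a full resource name, not just "default".
network: projects/PROJECT_NUMBER/global/networks/default
workerPoolSpecs:
  - machineSpec:
      machineType: n1-standard-4
    replicaCount: 1
    containerSpec:
      imageUri: gcr.io/PROJECT/my-training-image
```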
Furthermore, I added some debug output via RUN and ONBUILD RUN to the Dockerfile of my base image: the RUN step sees the project and service account from the build trigger, but the docker build done by gcloud ai custom-jobs create no longer has them.
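For illustration, debug lines like these in the base image's Dockerfile show the difference (a sketch; the printed command is just one way to show the active identity):

```dockerfile
# Runs when the base image itself is built: build-trigger credentials are present here.
RUN gcloud config list --format='value(core.account)' || true

# Runs later, during the downstream `docker build` that `gcloud ai custom-jobs create`
# performs: at that point the account and project are no longer set.
ONBUILD RUN gcloud config list --format='value(core.account)' || true
```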
Is there another way besides hard-coding a service account access key into the base image?
Upvotes: 0
Views: 1995
Reputation: 334
Basically, you need to do the following:

1. Authenticate and export an access token:

```shell
gcloud auth application-default login
export AUTH_TOKEN=$(gcloud auth print-access-token)
```

2. Pass the authenticated registry URL via the pip_index_urls argument of your component function:

```python
from kfp.dsl import component


@component(
    base_image="python:3.10",
    packages_to_install=["my-custom-package==0.1.0"],
    pip_index_urls=[
        "https://oauth2accesstoken:${AUTH_TOKEN}@us-east1-python.pkg.dev/<project>/<repository>/simple",  # replace <project> and <repository> accordingly
        "https://pypi.python.org/simple",
    ],
)
def step1(a: int, b: int) -> list:
    from my_custom_package import MyClass

    model = MyClass()
    ...
```

3. Forward the token to the component with set_env_variable (in kfp v2; for v1, take a look here) in your pipeline definition function:

```python
import os

from kfp.dsl import pipeline

from your_component.step1 import step1


@pipeline(
    name="my pipeline",
    description="my custom pipeline",
)
def my_pipeline(a: int, b: int):
    step1(a=a, b=b).set_env_variable(name="AUTH_TOKEN", value=os.getenv("AUTH_TOKEN"))
```
⚠️ Warning: If you take a look at your compiled pipeline YAML file, you'll see that AUTH_TOKEN appears there in plain text. Therefore, remember not to commit it to a public repository!
Upvotes: 0
Reputation: 6572
I don't use Vertex AI, but generally in GCP, if you want to use Python packages from Artifact Registry, there are two methods (the documentation is complete and gives the different steps). In the end you generate a pip.conf file containing the extra index URL pointing at the Artifact Registry repository.
If you use the method with the key encoded as base64, the following command will generate the pip.conf file for you:
gcloud artifacts print-settings python --project=PROJECT \
--repository=REPOSITORY \
--location=LOCATION --json-key=KEY-FILE
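The generated file is roughly of this shape (an assumed sketch; _json_key_base64 is the user name Artifact Registry uses for base64-encoded JSON keys, and the host and path are placeholders):

```ini
# pip.conf as produced by `gcloud artifacts print-settings python --json-key=...` (sketch)
[global]
extra-index-url = https://_json_key_base64:BASE64_ENCODED_KEY@LOCATION-python.pkg.dev/PROJECT/REPOSITORY/simple/
```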
In this case, you have to follow the best practices for handling a JSON key.
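To see how the base64 credential in that URL is formed, here is a minimal stand-alone sketch (the key contents and registry host are placeholders, not real credentials):

```python
import base64

# Stand-in for the contents of a downloaded service-account key.json (not a real key).
key_bytes = b'{"type": "service_account"}'

# Base64-encode the key, as embedded in the _json_key_base64 index URL.
encoded = base64.b64encode(key_bytes).decode("ascii")

index_url = f"https://_json_key_base64:{encoded}@europe-west4-python.pkg.dev/PROJECT/REPO/simple/"
print(index_url)
```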
In all cases, at the end you have to copy the pip.conf file to the expected place, to give Vertex AI the ability to download packages from Artifact Registry.
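In a Docker-based Vertex AI job, that copy step could look like this (a sketch; pointing pip at the file via the PIP_CONFIG_FILE environment variable avoids guessing pip's default search path):

```dockerfile
# Ship the generated pip.conf with the image and point pip at it explicitly.
COPY pip.conf /etc/pip.conf
ENV PIP_CONFIG_FILE=/etc/pip.conf

RUN pip install --no-cache-dir -r ./requirements.txt
```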
Upvotes: 0