Reputation: 2793
I am porting custom job training from GCP AI Platform to Vertex AI. I am able to start a job, but I can't find how to get its status or how to stream the logs to my local client.
For AI Platform I was using this to get the state:
from google.oauth2 import service_account
from googleapiclient import discovery

scopes = ['https://www.googleapis.com/auth/cloud-platform']
credentials = service_account.Credentials.from_service_account_file(keyFile, scopes=scopes)
ml_apis = discovery.build("ml", "v1", credentials=credentials, cache_discovery=False)
x = ml_apis.projects().jobs().get(name="projects/%myproject%/jobs/" + job_id).execute()  # execute HTTP request
return x['state']
And this to stream the logs:
cmd = 'gcloud ai-platform jobs stream-logs ' + job_id
This does not work for a Vertex AI job. What is the replacement code?
Upvotes: 1
Views: 537
Reputation: 119
If you're using the google-cloud-aiplatform SDK, you can get the training job's status via job.state.
Example:
from google.cloud import aiplatform

# Initialize the SDK
aiplatform.init(project="xxx", location="asia-east1", staging_bucket="xx_gcs_bucket")

# Create a custom container training job
job = aiplatform.CustomContainerTrainingJob(
    display_name="abc-model",
    container_uri="asia-east1-docker.pkg.dev/xxx/xxx",
)

# Start the training without blocking (submit returns immediately)
model = job.submit(
    replica_count=1,
    machine_type="n1-standard-16",
)

print(job.state)
# <PipelineState.PIPELINE_STATE_PENDING: 3>
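Since job.submit() returns immediately, you can poll job.state yourself until the job reaches a terminal state. A minimal sketch, assuming the terminal state names below match the PipelineState enum used by the SDK (the polling helper itself is generic and takes any callable):

```python
import time

# Terminal pipeline states, by name (these match PipelineState
# members in google.cloud.aiplatform; adjust if your SDK version differs).
TERMINAL_STATES = {
    "PIPELINE_STATE_SUCCEEDED",
    "PIPELINE_STATE_FAILED",
    "PIPELINE_STATE_CANCELLED",
}

def wait_for_state(get_state, poll_seconds=30, max_polls=None):
    """Poll get_state() until it returns a terminal state; return the final state name."""
    polls = 0
    while True:
        state = get_state()
        name = getattr(state, "name", str(state))  # accept an enum member or a plain string
        if name in TERMINAL_STATES:
            return name
        polls += 1
        if max_polls is not None and polls >= max_polls:
            return name
        time.sleep(poll_seconds)

# With the SDK job from the example above, you would call:
#   final = wait_for_state(lambda: job.state)
```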
Upvotes: 0
Reputation: 1377
Can you try this command for streaming logs:
gcloud ai custom-jobs stream-logs 123 --region=europe-west4
Here 123 is the ID of the custom job; you can also add gcloud-wide flags such as --format.
You can visit this link for more details about this command and additional flags available.
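If you want to drive that command from Python, mirroring the cmd string in the question, a minimal sketch using subprocess; build_stream_logs_cmd is a hypothetical helper, and the job ID and region are placeholders:

```python
import subprocess

def build_stream_logs_cmd(job_id, region):
    # gcloud ai custom-jobs stream-logs JOB_ID --region=REGION
    return ["gcloud", "ai", "custom-jobs", "stream-logs", str(job_id), f"--region={region}"]

# Streams until the job finishes; output goes to the local console.
# Uncomment to run (requires an authenticated gcloud installation):
#   subprocess.run(build_stream_logs_cmd("123", "europe-west4"), check=True)
```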
Upvotes: 2