MattK

Reputation: 51

How to get the error if starting a Compute Engine instance fails

I am starting an instance from PHP with this code:

function startInstance($g_project, $g_instance, $g_zone) {

    // Authenticate with application default credentials.
    $client = new Google_Client();
    $client->setApplicationName('Google-ComputeSample/0.1');
    $client->useApplicationDefaultCredentials();
    $client->addScope('https://www.googleapis.com/auth/cloud-platform');

    // Request the start and print the raw response.
    $service = new Google_Service_Compute($client);
    $response = $service->instances->start($g_project, $g_zone, $g_instance);
    echo json_encode($response);

}

Today I was lucky enough to notice that, for an unknown reason, the instance I wanted to start had failed to do so. I tried starting it from the GUI and got this error: Zone "some-zone" does not have enough resources available to fulfill the request. Try a different zone, or try again later.

I echoed the PHP response and compared it to the one I get when an instance starts successfully. My findings are shocking: the responses were exactly the same (apart from timestamps and IDs). How on earth can I differentiate between failed and successful instance starts if the response is the same?

https://cloud.google.com/compute/docs/reference/rest/v1/instances/start suggests that there will be an error object present in case of error. I can confirm that there is none.

Response for both the failed and the successful start:

{
    "clientOperationId": null,
    "creationTimestamp": null,
    "description": null,
    "endTime": null,
    "httpErrorMessage": null,
    "httpErrorStatusCode": null,
    "id": "id",
    "insertTime": "2019-01-28T14:22:36.664-08:00",
    "kind": "compute#operation",
    "name": "operation-name",
    "operationType": "start",
    "progress": 0,
    "region": null,
    "selfLink": "link/operation-name",
    "startTime": null,
    "status": "PENDING",
    "statusMessage": null,
    "targetId": "targetIdHere",
    "targetLink": "linkhere",
    "user": "user",
    "zone": "zone-in-question"
}

What do you suggest I do? Switching to a different zone is probably the best solution, but there is one problem: I don't even know that the instance didn't start successfully, so I can't react to it. Is this the expected behavior? What did you do to mitigate this problem?

Upvotes: 5

Views: 576

Answers (2)

mgphys

Reputation: 241

The response you get from the start call is just a (zone) operation resource corresponding to your asynchronous API call.

To determine the final status of your API call, you'll need to poll that operation with a get call until its status is DONE. Then, if its error field is not empty, it will contain the details of what went wrong, for example:

{
  ...
  "error": {
    "errors": [
      {
        "code": "ZONE_RESOURCE_POOL_EXHAUSTED",
        "message": "The zone 'projects/you-project/zones/us-central1-c' does not have enough resources available to fulfill the request."
      }
    ]
  }
  ...
}

Otherwise, the operation completed successfully.
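The polling described above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: waitForZoneOperation() is a hypothetical helper name, the attempt count and delay are arbitrary, and it assumes the same PHP client library as in the question, where the operation object exposes getStatus(), getError() and getName().

```php
<?php
// Hypothetical helper (not part of the Google client library).
// $fetchOperation is any callable that returns the current operation object,
// assumed to expose getStatus() and getError() as the Compute client does.
function waitForZoneOperation(callable $fetchOperation, int $maxAttempts = 60, int $delaySeconds = 2): array
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $operation = $fetchOperation();
        if ($operation->getStatus() === 'DONE') {
            $messages = [];
            $error = $operation->getError();
            // An empty error field means the operation succeeded.
            $items = $error ? ($error->getErrors() ?: []) : [];
            foreach ($items as $item) {
                $messages[] = $item->getCode() . ': ' . $item->getMessage();
            }
            return $messages; // empty array => success
        }
        sleep($delaySeconds);
    }
    throw new RuntimeException('Operation did not reach DONE in time.');
}

// Possible usage with the variables from the question (untested sketch):
// $operation = $service->instances->start($g_project, $g_zone, $g_instance);
// $errors = waitForZoneOperation(function () use ($service, $g_project, $g_zone, $operation) {
//     return $service->zoneOperations->get($g_project, $g_zone, $operation->getName());
// });
// if ($errors) { /* e.g. "ZONE_RESOURCE_POOL_EXHAUSTED: ..." => react here */ }
```

Passing the fetch step in as a callable keeps the loop independent of the client object, so the same helper can be exercised with a stub before pointing it at the real API.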

Upvotes: 1

Daniel Härter

Reputation: 180

I haven't actually observed the error you describe on GCE yet, but to get the "error state" of a GCE instance, you could query the Compute API with the instances.get method and evaluate "status" and "statusMessage" in the response.

HTTP request
GET https://www.googleapis.com/compute/v1/projects/{project}/zones/{zone}/instances/{resourceId}

The return values for status may be one of the following: PROVISIONING, STAGING, RUNNING, STOPPING, STOPPED, SUSPENDING, SUSPENDED, and TERMINATED.

See also the reference manual for this API Call: https://cloud.google.com/compute/docs/reference/rest/v1/instances/get

So if you query the status of your newly created GCE instance for some time, and only report "success" once the status has switched from "PROVISIONING" or "STAGING" to "RUNNING", you should be safe. I have never observed any errors during instance creation when the instance status was set to "RUNNING".
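As a rough illustration of that idea, the status check could be wrapped in a loop with an upper bound, so that it gives up if "RUNNING" is never reached. This is only a sketch under assumptions: waitUntilRunning() is a hypothetical helper name, the attempt count and delay are arbitrary, and the instance object is assumed to expose getStatus() as in the PHP client used in the question.

```php
<?php
// Hypothetical helper (not part of the Google client library).
// $fetchInstance is any callable returning the current instance object,
// assumed to expose getStatus().
function waitUntilRunning(callable $fetchInstance, int $maxAttempts = 60, int $delaySeconds = 2): bool
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        if ($fetchInstance()->getStatus() === 'RUNNING') {
            return true;
        }
        sleep($delaySeconds);
    }
    // RUNNING was never observed: treat the start as failed.
    return false;
}

// Possible usage with the variables from the question (untested sketch):
// $started = waitUntilRunning(function () use ($service, $g_project, $g_zone, $g_instance) {
//     return $service->instances->get($g_project, $g_zone, $g_instance);
// });
```

Note that this only tells you whether the instance came up, not why it failed; the error details live on the operation resource.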

Upvotes: 0
