Alberto Cantu

Reputation: 71

Google App Engine PHP55 random server crashes (500s) with error code 204

Our team has been developing a RESTful API with Slim PHP as the router and Propel ORM with MySQL, deploying it as a service on Google App Engine with this app.yaml config:

service: api
runtime: php55
api_version: 1
threadsafe: yes

instance_class: F1
automatic_scaling:
  min_idle_instances: automatic
  max_idle_instances: automatic
  min_pending_latency: automatic
  max_pending_latency: automatic

skip_files:
- ^vendor/(.*/)+[Tt]ests/.*$
- ^\.(.*)

handlers:
- url: .*
  script: app.php

The API is consumed by an Ember.js web app. Throughout development we have been getting strange, patternless server crashes (500s). More precisely:

500 Server Error Error: Server Error The server encountered an error and could not complete your request. Please try again in 30 seconds.

with this entry in the App Engine log:

A problem was encountered with the process that handled this request, causing it to exit. This is likely to cause a new process to be used for the next request to your application. (Error code 204)

This happens on random endpoints that otherwise work fine 99% of the time. Of course, we don't feel like going into production with these random crashes.

What we have tried:

  1. Checking whether MySQL max_connections is being reached, since we open and close a connection on every request.
  2. Upgrading our instances from F1 to F4_1G for testing, to rule out the possibility that we are running out of memory.
  3. Stress testing on localhost with dev_appserver.py (we don't get any crashes there).
  4. Wrapping the whole Slim app in a try/catch for debugging (it never catches an exception, which leads us to think the problem really has something to do with Google App Engine).

Here is some code from the normal request flow.

app.php

/*
 * Create SLIM application
 */
$app = new \Slim\App([
    "settings"  => [
        "determineRouteBeforeAppMiddleware" => true,
    ]
]);

// Custom authentication middleware
$app->add(new \OAuth2Server\SlimAuthenticationMiddleware());

// CORS and headers middleware
$app->add(function($request, $response, $next) {

    $response = $next($request, $response);

    $response = $response->withHeader("Access-Control-Allow-Origin", "*");
    $response = $response->withHeader("Access-Control-Allow-Headers", "Content-Type, authorization");
    $response = $response->withHeader("Access-Control-Allow-Methods", "POST, GET, PATCH, DELETE, OPTIONS");
    $response = $response->withHeader("content-type", "application/json; charset=utf8");

    return $response;

});

require_once("router.php");

$app->run();

router.php

$app->get($apiVersionPath.'/visits/{id}','\Controllers\Visits:get')
    ->add(new \OAuth2Server\ValidateRequestUser(array("doctor", "nurse","superuser","admin")));

Visits Controller GET/ID relevant code.

    /**
     * @param Request $request
     * @param Response $response
     * @param array $args
     * @return Response
     */
    public function get($request, $response, $args) {

        $id = $request->getAttribute("route")->getArgument("id");

        $serializer = new Serializer();

        if (!is_numeric($id) || $id == 0) {
            throw new InvalidArgumentException("Invalid Argument");
        }

        $visit = \VisitQuery::create()
            ->useUserQuery()
                ->filterByClientId($request->getAttribute("user")->getClientId())
            ->endUse();

        $visit = $visit->findPk($id);

        if (!isset($visit) || !($visit instanceof \Visit)) {
            throw new EntityNotFoundException("Visit not found");
        }

        $resource = $visit->toResource(false);

        $serializer->addResource($resource);

        $body = $response->getBody();
        $body->write($serializer->serialize());
        return $response;

    }

Upvotes: 1

Views: 204

Answers (2)

Alberto Cantu

Reputation: 71

This was caused by threadsafe: yes in app.yaml. Set it to no (false).
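
As a sketch of that change, assuming the rest of the app.yaml from the question stays as it is, only the threadsafe line differs:

```yaml
# Top of app.yaml from the question, with the one changed line.
service: api
runtime: php55
api_version: 1
threadsafe: false   # was "yes"; "no" and "false" are equivalent YAML booleans
```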

Upvotes: 1

dwelling

Reputation: 491

We run an API service on a PHP Flex Engine and noticed a similar issue when using automatic scaling. To fix it, we had to bump up the instance class (try going to F2) and always keep at least two instances running by setting min_idle_instances to 2.

We have also run into this same issue on the Standard version of App Engine when using the task queue and basic scaling. It doesn't look like you are doing that yet, but if you do, the only solution we found was to enable retries in queue.yaml and set the 'Fail Fast' header when adding tasks to the push queue:
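
In app.yaml terms, that would look roughly like the sketch below. It reuses the scaling keys already present in your config; F2 is only a suggested starting point, and the settings you don't override keep their defaults.

```yaml
# Scaling section of app.yaml (sketch; merge into the existing file).
instance_class: F2          # one step up from F1; go higher if crashes persist
automatic_scaling:
  min_idle_instances: 2     # always keep at least two warm instances
  max_idle_instances: automatic
  min_pending_latency: automatic
  max_pending_latency: automatic
```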

$pushTask = new PushTask($handler,
  array('message_data' => json_encode($message)),
  array('header'=> 'X-AppEngine-FailFast:true'));

Otherwise, groups of tasks would fail with the 204 error.
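
For the retry half of that setup, a queue.yaml along these lines should work. This is a minimal sketch: the queue name, rate, and every retry value are assumptions to tune for your workload, not the numbers we run with.

```yaml
# queue.yaml (sketch with placeholder values)
queue:
- name: default
  rate: 5/s
  retry_parameters:
    task_retry_limit: 5       # give up after five retries
    min_backoff_seconds: 10   # wait at least 10 s before the first retry
    max_backoff_seconds: 200  # cap the delay between retries
```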

One thing that interests me about your issue is that it looks like you are trying to make an HTTP Request. In all our testing, one of the few ways we were able to reproduce the error was by dropping hundreds of tasks on a queue and having each of them run the following code:

// Marker written before the request; we always found this entry in Memcache.
$memCache = new Memcache;
$memCache->set($_SERVER['HTTP_X_APPENGINE_TASKNAME'] . '_1', 'Test 1');

$ch = curl_init();
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
curl_setopt($ch, CURLOPT_POSTFIELDS, 'hi');
curl_setopt($ch, CURLOPT_URL, 'https://www.google.com');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$response = curl_exec($ch);

// Marker written after the request; missing whenever the 204 error occurred.
$memCache->set($_SERVER['HTTP_X_APPENGINE_TASKNAME'] . '_2', 'Test 2');

Any time we got the error, we could find the first entry by its key {TASK_NAME}_1 in Memcache, but we could never find the second entry {TASK_NAME}_2. And like you said, no exceptions are caught because the entire script dies.

This behavior made us believe there might be an issue with Google's implementation of cURL, since we are using the full-fledged version:

extension = php_curl.dll

But we don't have a definitive answer. The only solution for us has been to increase our instance count and rely on retries to get our tasks to complete.

Hopefully one of the solutions above works for you. If you get a chance, can you show us what is in your php.ini file?

Upvotes: 2
