Ben

Reputation: 62336

Job has been attempted too many times or run too long

I have a job that works flawlessly locally, but in production it doesn't. I've wrapped the entire handle() in a try/catch and am not seeing anything logged to Bugsnag, despite many other exceptions from elsewhere in the deployment showing up there.

public function handle() {
    try {

        // do stuff

    } catch (\Exception $e) {
        Bugsnag::notifyException($e);

        throw $e;
    }
}

According to Laravel Horizon this queue job runs for 0.0026001930236816406 seconds, I never see it do its work, and I never see any other errors related to this job in the failed_jobs table.

config/queue.php

    'redis' => [
        'driver' => 'redis',
        'connection' => 'default',
        'queue' => 'default',
        'retry_after' => (60 * 10), // 10 minutes
        'block_for' => null,
    ],

config/horizon.php

'environments' => [
    'production' => [
        'supervisor'        => [
            'connection'    => 'redis',
            'queue'         => [
                'default',
            ],
            'balance'       => 'auto',
            'processes'     => 10,
            'tries'         => 3,

            // 10 seconds under the queue's retry_after to avoid overlap
            'timeout'       => (60 * 10) - 10, // Just under 10 mins
        ],
    ],
],

If something is causing this job to retry over and over, how can I find out what? I'm at a loss.

Investigation thus far

SELECT DISTINCT exception, COUNT(id) as errors
FROM failed_jobs 
WHERE payload LIKE '%[TAG-JOB-HAS]%' 
GROUP BY exception;

hoping to see more detail than this error message:

Job has been attempted too many times or run too long

but that's all I see.

Upvotes: 42

Views: 105018

Answers (12)

pablorsk

Reputation: 4276

When you encounter the error Job has been attempted too many times or run too long in Laravel, there can be multiple causes.

1. If failed() Exception is always MaxAttemptsExceededException

If the failed() method receives a MaxAttemptsExceededException, the job might be running out of memory. Since that is a fatal error, no specific exception is thrown.

To test this, run the same code in an Artisan command. If you get something like "PHP Fatal error: Allowed memory size of 268435456 bytes exhausted", then this is your problem.

Solution: increase the memory limit at the beginning of the job's handle() method using ini_set():

public function handle()
{
    ini_set('memory_limit', '1024M'); // Increases available memory
    // Job code, like image resizing
}

2. If You Use Horizon Queue

👉 You have to set retry_after (in config/queue.php) and timeout (in config/horizon.php).

The two values work together: exceeding either one produces the exception "has been attempted too many times or run too long. The job may have previously timed out."

config/queue.php:

'redis' => [
    'driver' => 'redis',
    'connection' => 'default',
    'queue' => env('REDIS_QUEUE', 'default'),
    'retry_after' => 60 * 7, // always greater than timeout
    'after_commit' => true, // check this if you dispatch jobs/events inside of DB transactions.
    'block_for' => null,
],
/* ... */

config/horizon.php:

'defaults' => [
    'supervisor-1' => [
        'connection' => 'redis',
        'queue' => ['default'],
        /* ... */
        'timeout' => 60 * 5, // always lower than retry_after
    ],
    /* ... */

Explanation: Think of retry_after as a global control, a process that periodically checks whether any job has been reserved for too long. timeout, on the other hand, is precise and applied when the job is launched (it is exactly the --timeout flag of the horizon work command). Therefore, timeout should always be smaller than retry_after, and retry_after should be at least as long as the longest job on that connection takes. (More on job expiration in the official documentation.)

👉 Not mandatory, but if you don't want to raise these limits globally and you have a few particularly long jobs, put them on a dedicated queue with a higher timeout (check this issue).

👉 Also, check that you don't have any infinite loops, for example ones caused by model observers. Sometimes a Model1 observer touches Model2 and fires its observer; that Model2 observer touches Model1 again, so the Model1 observer fires again, and so on. You never get a specific error log for this situation, only "has been attempted too many times...", as illustrated below.
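As an illustration only (these model and observer names are hypothetical, and this assumes a Laravel version that has saveQuietly()), the loop looks roughly like this, and saving quietly is one way to break it:

<?php

namespace App\Observers;

use App\Models\Invoice;
use App\Models\Order;

class OrderObserver
{
    public function saved(Order $order): void
    {
        // Touching the invoice fires InvoiceObserver@saved...
        $order->invoice->touch();
    }
}

class InvoiceObserver
{
    public function saved(Invoice $invoice): void
    {
        // ...which would otherwise touch the order again and re-fire
        // OrderObserver@saved, looping until the worker hits retry_after/timeout.
        // Saving without firing model events breaks the cycle:
        $order = $invoice->order;
        $order->synced_at = now();
        $order->saveQuietly();
    }
}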

Summary

  1. If failed() receives MaxAttemptsExceededException: check whether the job runs out of memory and increase the limit with ini_set().
  2. If using Queue with Horizon: Set retry_after in config/queue.php and timeout in config/horizon.php, ensuring that timeout is less than retry_after.

Upvotes: 1

Hassaan Tariq

Reputation: 61

Try disabling (setting to 0 or null) the serializer and compression options in your config/database.php Redis configuration:

'redis' => [
    'client' => env('REDIS_CLIENT', 'phpredis'),

    'options' => [
        'cluster' => env('REDIS_CLUSTER', 'redis'),
        'prefix' => env('REDIS_PREFIX', Str::slug(env('APP_NAME', 'laravel'), '_').'_database_'),
        'password' => env('REDIS_PASSWORD', null),
        'serializer' => 0,
        'compression' => 0,
    ],

    // ...
],

Upvotes: 2

Igor

Reputation: 1111

This is a complex issue to work with. It happens when you have a huge amount of data to process, or are working with a slow API or a slow/weak server, so you may have to tune several things:

  • webserver (Nginx for sure)
  • PHP-CLI
  • Laravel
  • Supervisor

You need to determine a suitable execution timeout yourself. In this example, let's use 500 seconds.

Nginx config

proxy_read_timeout directive - increase this value to the timeout you need in your Nginx config.

nano /etc/nginx/sites-available/mysite.com.conf

proxy_read_timeout 500;

Don't forget to restart Nginx: sudo service nginx restart

PHP-CLI config

memory_limit directive controls RAM usage. You can turn off the limit entirely for PHP-CLI scripts with a value of -1.

max_execution_time directive is the time, in seconds, a script is allowed to run. Optional, but you can try increasing this value as well.

nano /etc/php/8.2/cli/php.ini

memory_limit=-1

max_execution_time=500

Don't forget to restart PHP: sudo service php8.2-fpm restart

Laravel

Timeouts. Increase the timeout of your queue job/task. You can also turn off failing on timeout (not recommended):

<?php

namespace App\Jobs;

use Illuminate\Contracts\Queue\ShouldQueue;

class MyJobTask implements ShouldQueue
{
    /**
     * The number of seconds the job can run before timing out.
     *
     * @var int
     */
    public $timeout = 500;

    /**
     * Indicate if the job should be marked as failed on timeout.
     *
     * @var bool
     */
    public $failOnTimeout = false; // not recommended
}

Play with the retry_after option in config/queue.php. Set the value (in seconds) as high as reasonable: more than the timeout value, but less than the interval until the next job run. This helps with the "[job] has been attempted too many times" error. The error may also happen if you have break points in your job code such as dd, die or exit.

<?php
'database' => [
    'driver' => 'database',
    'table' => 'jobs',
    'queue' => 'default',
    'retry_after' => 505,
    'after_commit' => false,
    'timeout' => 500,
],

Also, if you are working with a huge array/collection, use a LazyCollection and process the data in chunks, as sketched below.
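A minimal sketch of that idea inside a job's handle() method, assuming a hypothetical Order model:

use App\Models\Order;

public function handle()
{
    // lazy() runs the query with a cursor and hydrates only a small buffer of
    // models at a time, instead of loading the whole table into memory.
    Order::query()
        ->where('status', 'pending')
        ->lazy(500)            // hydrate 500 rows at a time
        ->chunk(500)           // hand them to the callback in batches
        ->each(function ($orders) {
            foreach ($orders as $order) {
                // ... do the actual per-order work here
            }
        });
}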

Supervisor

Supervisor is a service that controls queue processes on the server side. Install it and tune the supervisor config for your Laravel project, and don't forget to provide the --timeout option on the command in your config.

command=/usr/bin/php /var/www/mysite.com/artisan queue:work database --timeout=500
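For reference, a minimal supervisor program block along those lines might look like the following (paths, names and numbers are placeholders you would adapt to your project):

[program:mysite-worker]
process_name=%(program_name)s_%(process_num)02d
command=/usr/bin/php /var/www/mysite.com/artisan queue:work database --timeout=500 --tries=3
autostart=true
autorestart=true
user=www-data
numprocs=4
redirect_stderr=true
stdout_logfile=/var/www/mysite.com/storage/logs/worker.log
stopwaitsecs=510

Note that stopwaitsecs should be a bit longer than your longest job, so supervisor doesn't kill a worker mid-job when stopping it.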

In your config/queue.php configuration file, each queue connection defines a retry_after option. This option specifies how many seconds the queue connection should wait before retrying a job that is being processed. For example, if the value of retry_after is set to 90, the job will be released back onto the queue if it has been processing for 90 seconds without being released or deleted. Typically, you should set the retry_after value to the maximum number of seconds your jobs should reasonably take to complete processing.

The --timeout value should always be at least several seconds shorter than your retry_after configuration value. This ensures that a worker processing a frozen job is always terminated before the job is retried. If your --timeout option is longer than your retry_after configuration value, your jobs may be processed twice.

Upvotes: 6

Romek

Reputation: 334

I was having the same issue, but the cause was different than most answers here.

I was using a Heroku free tier Redis instance, which supports up to 30 connections. This was not a problem until I started using supervisor to run multiple queue workers at once.

It seems the jobs were failing because the connection limit was being reached when several workers connected to add data to the database and process it at the same time.

I upgraded to the second tier with 256 connections and the issue is now gone.

Edit: The error was still occurring. It was because I was using the WithoutOverlapping middleware in my job, and the $key I was giving it wasn't unique due to a mistake in my code. Make sure the $key is unique if you're also using it.
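For anyone else using it, a minimal sketch of that middleware with a key that is actually unique per record ($this->order is just an illustrative property):

use Illuminate\Queue\Middleware\WithoutOverlapping;

/**
 * Get the middleware the job should pass through.
 */
public function middleware(): array
{
    // The lock key must identify the unit of work; if every job shares the
    // same key, they all queue up behind a single lock and can time out.
    return [
        (new WithoutOverlapping($this->order->id))->releaseAfter(60),
    ];
}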

Upvotes: 0

Zahit Rios

Reputation: 179

https://laravel.com/docs/9.x/queues#timeout

You can set the number of seconds each job is allowed to run before an exception is thrown:

php artisan queue:work --timeout=300

Upvotes: 0

SAm Vice

Reputation: 1

This solved my issue: php artisan queue:work --timeout=600 --tries=30

Upvotes: 0

Yasin Patel

Reputation: 5721

I had the same problem

I resolved the issue by adding the property below to the Job class.

public $failOnTimeout = false;

With this, the job is not marked as failed when it times out. I also increased the timeout:

public $timeout = 120000;

Reference : https://laravel.com/docs/9.x/queues#failing-on-timeout

Upvotes: 5

Arthur Shlain

Reputation: 1099

Perhaps this will help someone: don't use dd() in queued tasks.

Upvotes: 5

LeviZoesch

Reputation: 1621

If you've seen this MaxAttemptsExceededException in your error logs or failed_jobs table and you don't have a clue what happened to the job, let me try to explain what may have happened. It's either:

  • The job timed out and it can't be attempted again.
  • The job was released back to the queue and it can't be attempted again.

If your job's processing time exceeded the timeout configuration, the worker will check the maximum attempts allowed and the expiration date for the job and decide whether it can be attempted again. If not, the worker will simply mark the job as failed and throw that MaxAttemptsExceededException.

Also, if the job was released back to the queue and a worker picks it up, it'll first check whether the maximum attempts allowed was exceeded or the job has expired, and throw MaxAttemptsExceededException in that case.
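Both of those limits are configurable on the job itself; a minimal sketch (the class name is just an example):

namespace App\Jobs;

use Illuminate\Contracts\Queue\ShouldQueue;

class ProcessReport implements ShouldQueue
{
    // Maximum number of attempts before MaxAttemptsExceededException is thrown.
    public $tries = 5;

    // Alternatively, allow retries until a point in time instead of counting attempts.
    public function retryUntil(): \DateTime
    {
        return now()->addMinutes(10);
    }
}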

https://divinglaravel.com/job-has-been-attempted-too-many-times-or-run-too-long

Upvotes: 2

a_sarana

Reputation: 512

According to documentation, you can handle job failing in two common ways:

  • using failed job events
  • using failed() method.

In the first case, you can handle all jobs using the Queue::failing() method. You'll receive an Illuminate\Queue\Events\JobFailed event as a parameter, and it contains the exception.
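A minimal sketch of this first approach, registered in a service provider's boot() method (the log call is just an example of what you might do with the event):

use Illuminate\Queue\Events\JobFailed;
use Illuminate\Support\Facades\Log;
use Illuminate\Support\Facades\Queue;

public function boot(): void
{
    Queue::failing(function (JobFailed $event) {
        // $event->connectionName, $event->job and $event->exception are available here.
        Log::error('Queue job failed', [
            'job'       => $event->job->resolveName(),
            'exception' => $event->exception->getMessage(),
        ]);
    });
}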

In the second case, you can use the failed() method; it should be placed next to your handle() method. It also receives the exception as a parameter.

Example:

public function failed(\Throwable $exception)
{
    // Log failure
}

Hope this helps.

Upvotes: 9

Anil Kumar

Reputation: 844

Try to catch the exception in the failed() method provided by Laravel:

/**
* The job failed to process.
*
* @param  Exception  $exception
* @return void
*/
public function failed(Exception $exception)
{
    // Send user notification of failure, etc...
}

Also check whether your default queue driver locally is sync; if it is, the different behavior is expected, since sync jobs run inline and are never retried.
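If that's the case, the difference shows up in your .env files (assuming a Laravel version that uses the QUEUE_CONNECTION variable):

# .env locally – sync runs jobs inline, so retries and timeouts never apply
QUEUE_CONNECTION=sync

# .env in production – a real queue, subject to retry_after, timeout and tries
QUEUE_CONNECTION=redis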

Upvotes: 25

Yasser

Reputation: 1369

I had the same problem

I fixed it by increasing the 'retry_after' parameter.

Make sure the retry_after value is greater than the time it takes a job to run.

In the config/queue.php file:

    'connections' => [

    'sync' => [
        'driver' => 'sync',
    ],

    'database' => [
        'driver' => 'database',
        'table' => 'jobs',
        'queue' => 'default',
        'retry_after' => 9000,
    ],
],

Upvotes: 47
