Jared Eitnier

Reputation: 7152

How to throttle API requests using PHP

We plan to use the SEMrush API, which allows access to SEO data relating to domain names and search keywords. Their Terms of Use limit usage to avoid overloading their servers:

You may not perform more than 10 requests per second, nor more than 2 simultaneous requests.

We are going to be building a simple tool in PHP that aggregates data based on a domain name and are looking for the basics on how to fulfill that requirement. We are planning for hundreds/thousands of potential simultaneous users.

Maybe someone can provide some pseudo code in PHP that would let us do this - or is it really just as simple as forcing the actual API request function to sleep for 1 second in between each command? I don't have a lot of experience with APIs and large amounts of concurrent users so any help is appreciated.

Upvotes: 3

Views: 8317

Answers (2)

Jay

Reputation: 323

Some APIs return rate limit information in the response headers (see "Examples of HTTP API Rate Limiting HTTP Response headers"). You can use this information to pause briefly before sending your next request, for example with PHP's time_nanosleep().
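As a rough sketch of that idea, the function below works out how long to pause from a set of response headers. The header names (Retry-After, X-RateLimit-Remaining, X-RateLimit-Reset) and the assumption that the reset value is a Unix timestamp are illustrative only; secondsToWait() is a hypothetical helper, and you would need to check what the API you call actually sends.

```php
<?php
/**
 * Return how many seconds to wait before the next request.
 * Header names are assumptions; they differ between APIs.
 *
 * @param array $headers Response headers, keyed by header name.
 * @param int   $now     Current Unix time (passed in so it can be tested).
 */
function secondsToWait(array $headers, int $now): int
{
    // HTTP header names are case-insensitive, so normalise the keys.
    $h = array_change_key_case($headers, CASE_LOWER);

    // Retry-After in its "number of seconds" form takes priority.
    if (isset($h['retry-after']) && ctype_digit((string) $h['retry-after'])) {
        return (int) $h['retry-after'];
    }

    // Otherwise wait only when the quota is used up.
    if (isset($h['x-ratelimit-remaining'], $h['x-ratelimit-reset'])
        && (int) $h['x-ratelimit-remaining'] <= 0) {
        // Assumes the reset value is a Unix timestamp.
        return max(0, (int) $h['x-ratelimit-reset'] - $now);
    }

    return 0;
}
```

You could then call sleep(secondsToWait($headers, time())) before the next request.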

Some PHP libraries go into considerable depth on rate limiting. The token bucket algorithm is pretty common across the web: https://github.com/bandwidth-throttle/token-bucket
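To show the idea behind a token bucket, here is a minimal in-process sketch. It is not the API of the library linked above; the class name SimpleTokenBucket and its methods are made up for illustration. The bucket holds up to $capacity tokens, refills at $ratePerSec, and each request must take one token.

```php
<?php
/**
 * Minimal token bucket (illustrative only; not the linked library's API).
 */
final class SimpleTokenBucket
{
    /** @var float Maximum number of tokens (burst size). */
    private $capacity;
    /** @var float Tokens added per second. */
    private $ratePerSec;
    /** @var float Tokens currently available. */
    private $tokens;
    /** @var float Time of the last refill, in seconds. */
    private $lastRefill;

    public function __construct(int $capacity, float $ratePerSec, float $now)
    {
        $this->capacity = (float) $capacity;
        $this->ratePerSec = $ratePerSec;
        $this->tokens = (float) $capacity; // start full
        $this->lastRefill = $now;
    }

    /** Try to take one token at time $now; true means the request may go ahead. */
    public function tryConsume(float $now): bool
    {
        // Add tokens for the elapsed time, capped at the capacity.
        $elapsed = max(0.0, $now - $this->lastRefill);
        $this->tokens = min($this->capacity, $this->tokens + $elapsed * $this->ratePerSec);
        $this->lastRefill = $now;

        if ($this->tokens >= 1.0) {
            $this->tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

Pass microtime(true) as $now in real use. Note that this state lives in a single PHP process; with many concurrent web requests the counters would have to be kept somewhere shared, which is what a full library handles for you.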

I find this a bit overkill for throttling a few URL requests to an API that doesn't send something like X-RateLimit-Remaining in its response headers. API requests in general are usually pretty slow anyway. So I've built the PHP script below.

This PHP script waits for roughly 1 / $requestsInSeconds seconds per call, keyed by a $throttlerID. A higher $requestsInSeconds results in shorter wait times. If the same $throttlerID is used across simultaneous requests, each request waits for the others using file locking (flock()).

    function Throttler($requestsInSeconds, $throttlerID) {

        // Use flock() as a system-wide lock (the OS releases it if the process crashes)
        $fp = fopen(sys_get_temp_dir()."/$throttlerID", "w+");
        if ($fp === false) {
            return; // lock file could not be opened, so no throttling is applied
        }

        // An exclusive lock blocks until it is obtained
        if (flock($fp, LOCK_EX)) {

             // Sleep for about 1/$requestsInSeconds seconds ($requestsInSeconds must be 1 or higher).
             // time_nanosleep() expects an integer below 1,000,000,000 nanoseconds.
             $time_to_sleep = (int) (999999999 / $requestsInSeconds);
             time_nanosleep(0, $time_to_sleep);

             flock($fp, LOCK_UN); // unlock
         }

        fclose($fp);

    }

Put the call to Throttler() right before each cURL call. That's it!
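One possible refinement, shown here only as a sketch: instead of always sleeping for the full interval, store the time of the last released request in the lock file and sleep just for whatever remains. The function name throttleByTimestamp() and the ".stamp" file suffix are made up for this example, and $requestsPerSecond is assumed to be greater than zero.

```php
<?php
/**
 * Variant sketch: sleep only for the part of the interval that has not
 * already passed since the previous request. Returns the seconds slept.
 */
function throttleByTimestamp(float $requestsPerSecond, string $throttlerId): float
{
    $interval = 1.0 / $requestsPerSecond;
    $path = sys_get_temp_dir() . '/' . $throttlerId . '.stamp';

    // "c+" opens for reading and writing and creates the file without truncating it.
    $fp = fopen($path, 'c+');
    if ($fp === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $slept = 0.0;
    if (flock($fp, LOCK_EX)) {
        // Time the previous caller was released (0.0 if the file is empty).
        $last = (float) stream_get_contents($fp);
        $wait = $last + $interval - microtime(true);
        if ($wait > 0) {
            usleep((int) round($wait * 1e6));
            $slept = $wait;
        }

        // Record the release time for the next caller.
        ftruncate($fp, 0);
        rewind($fp);
        fwrite($fp, sprintf('%.6F', microtime(true)));
        fflush($fp);
        flock($fp, LOCK_UN);
    }
    fclose($fp);

    return $slept;
}
```

It is called the same way, right before each request, for example throttleByTimestamp(10, 'semrush'). Like the script above, it lets only one request through at a time per ID.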

Upvotes: 1

Sherif

Reputation: 11943

PHP is really not the best language to use for concurrent programming. However, there are some third-party solutions that you can use alongside PHP to achieve your goals.

What you need is a job manager or a queue system that can handle the actual requests for you. Since this is a back-end tool (at least that's what I gathered from your question), PHP doesn't need to control the jobs itself. Instead, a separate controlling process schedules the individual jobs and hands them to your PHP scripts, which lets you impose these limits effectively.

My first suggestion would be to try something like Gearman, which is a great job manager and has a PHP extension to help you interface with the library.

Another suggestion is to take a look at queue systems like AMQP or ZeroMQ (zmq), some of which also have PHP extensions.

So here's an example scenario for you...

You have a PHP script that accepts these requests and hands them off to your job manager or queue over a socket. The job manager or queue stores each request and distributes it to the individual workers in a way that can be centralized and controlled to impose these limits. There are some examples in the links I gave you that can help you get there. However, doing it purely in PHP without the aid of these tools will prove quite tricky and could lead to buggy edge-case behavior if not carefully designed.
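To make the worker side of that scenario concrete, here is a minimal sketch that uses an in-memory SplQueue as a stand-in for the real job manager or queue. The function runWorker() and its parameters are hypothetical and not part of Gearman, AMQP or ZeroMQ. The worker takes one job at a time and pauses between jobs so that it never exceeds $maxPerSecond.

```php
<?php
/**
 * Drain a queue one job at a time, pausing between jobs so that no more
 * than $maxPerSecond jobs are started per second.
 *
 * @param SplQueue $jobs         Stand-in for the real queue.
 * @param int      $maxPerSecond Upper bound on jobs per second (1 or higher).
 * @param callable $handle       Called with each job (e.g. performs the API request).
 * @param callable $sleep        Called with a delay in microseconds (e.g. 'usleep').
 * @return int Number of jobs handled.
 */
function runWorker(SplQueue $jobs, int $maxPerSecond, callable $handle, callable $sleep): int
{
    $intervalMicros = intdiv(1000000, $maxPerSecond);
    $done = 0;

    while (!$jobs->isEmpty()) {
        $job = $jobs->dequeue();
        $handle($job);
        $done++;

        // Pause only if there is more work waiting.
        if (!$jobs->isEmpty()) {
            $sleep($intervalMicros);
        }
    }

    return $done;
}
```

With a real queue you would pass 'usleep' as $sleep and run a fixed number of such workers. For the limits in the question, two workers at five jobs per second each would stay within 10 requests per second and 2 simultaneous requests, assuming nothing else calls the API.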

Upvotes: 2
