Reputation: 4514
One of the data sources I extract data from provides access through a REST API in the form of JSON responses. That's great, because the data comes back already structured, i.e., less pain with scraping and parsing unstructured HTML documents.
However, they constrain HTTP traffic with rate limiting: requests per minute/hour/month/IP/user email.
When I was scraping HTML documents with Scrapy, I could easily configure the number of requests per second, delays between subsequent requests, the number of threads, etc. I'll call this the "load strategy". The way it works under the hood is that I generate a number of HTTP requests, Scrapy puts them into a queue, and it processes requests from the queue according to the given "load strategy".
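For example, in Scrapy the whole "load strategy" is just a handful of settings (the values below are placeholders):

```python
# settings.py -- Scrapy throttling knobs (placeholder values)
DOWNLOAD_DELAY = 0.5                # seconds to wait between requests to the same domain
CONCURRENT_REQUESTS = 16            # global cap on concurrent requests
CONCURRENT_REQUESTS_PER_DOMAIN = 4  # per-domain cap
AUTOTHROTTLE_ENABLED = True         # adapt delays to observed server latency
```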
Is there something like that for REST APIs?
To give some context, I'm using a Python REST client generated from the data source's Swagger definitions. The client uses urllib3 under the hood. It provides a way to execute requests asynchronously and a way to configure a thread pool, but it looks like I would need to play around a bit to configure it. I'm looking for an out-of-the-box solution.
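For instance, the generated client exposes something roughly like this (the class and method names here are hypothetical and depend on the Swagger definition; `async_req` is the flag used by openapi-generator Python clients):

```python
# Hypothetical usage of a generated client; names depend on your Swagger definition.
api_instance = ItemsApi(api_client)

# Runs on the client's internal thread pool and returns immediately.
thread = api_instance.get_items(page=1, async_req=True)
result = thread.get()  # block until the response arrives
```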
Upvotes: 0
Views: 210
Reputation: 348
With a generated client you will be able to make requests to the corresponding REST API. However, you'll need to build your own code/logic for inserting delays between requests and for request queuing. Much of the convenience that Scrapy provides will need to be implemented by you, or you'll need to find a tool/package that provides this functionality for you.
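As a rough sketch of the package route, the `ratelimit` package on PyPI gives you a decorator-based throttle; the body of `fetch_page` is a stub standing in for a hypothetical method of your generated client:

```python
import time
from ratelimit import limits, sleep_and_retry

ONE_MINUTE = 60

@sleep_and_retry                      # wait for a free slot instead of raising
@limits(calls=30, period=ONE_MINUTE)  # allow at most 30 calls per minute
def fetch_page(page):
    # Replace this stub with a call to your generated client,
    # e.g. api_instance.get_items(page=page) -- the name is hypothetical.
    return {"page": page, "fetched_at": time.time()}

for page in range(5):
    print(fetch_page(page))           # delays are inserted automatically
```

This only covers the delay part; request queuing and per-IP/per-user limits would still be your own logic.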
Upvotes: 1