SteadyGrow99

Reputation: 51

How do I iterate through and append the data from multiple pages with an API request?

I'm using an Indeed API from RapidAPI to collect job data. The code snippet provided only returns results for one page. How can I set up a for loop that iterates through multiple pages and appends the results together?

import requests

url = "https://indeed11.p.rapidapi.com/"

payload = {
    "search_terms": "data visualization",
    "location": "New York City, NY",
    "page": 1,
    "fetch_full_text": "yes"
}

headers = {
    "content-type": "application/json",
    "X-RapidAPI-Key": "{api key here}", # insert here,
    "X-RapidAPI-Host": "indeed11.p.rapidapi.com"
}

response = requests.request("POST", url, json=payload, headers=headers)

As seen in the code above, the key "page" is set to a value of 1. How would I parameterize this value, and how would I construct the for loop while appending the results from each page?

Upvotes: 0

Views: 1443

Answers (3)

Exabyte Miner 256

Reputation: 11

You could do this with a while loop. You would need code that detects when there are no more pages to read. Here's what I would do:

import requests

url = "https://indeed11.p.rapidapi.com/"

payload = {
    "search_terms": "data visualization",
    "location": "New York City, NY",
    "page": 1,
    "fetch_full_text": "yes"
}

headers = {
    "content-type": "application/json",
    "X-RapidAPI-Key": "{api key here}", # insert here,
    "X-RapidAPI-Host": "indeed11.p.rapidapi.com"
}

responses = []
# no_more_pages() is a placeholder for code that detects when there are no more pages to read
while not no_more_pages():
    responses.append(requests.request("POST", url, json=payload, headers=headers))
    payload["page"] += 1

Once the loop is done, the responses list holds one Response object per page, and you can call .json() on each one to access the data.
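One way to fill in the no_more_pages() placeholder is to stop when a page comes back empty. The sketch below is only an illustration: it assumes the endpoint returns a JSON list of job records and an empty list once the pages run out, which I have not verified for this API. The function name collect_until_empty and the max_pages cap are hypothetical, and session is expected to be something like requests.Session().

```python
def collect_until_empty(session, url, payload, headers, max_pages=50):
    """Request successive pages until one comes back empty.

    ASSUMPTION: each response body is a JSON list of records, and pages
    past the end return an empty list. Check the real response first.
    """
    query = dict(payload)  # copy so the caller's payload is not mutated
    query["page"] = 1
    records = []
    while query["page"] <= max_pages:  # hard cap guards against an endless loop
        response = session.post(url, json=query, headers=headers, timeout=30)
        response.raise_for_status()  # fail on HTTP errors instead of storing them
        page_records = response.json()
        if not page_records:  # empty page: assume nothing is left
            break
        records.extend(page_records)  # append this page's records to the total
        query["page"] += 1
    return records
```

With the variables from the question, a call would look like collect_until_empty(requests.Session(), url, payload, headers). If the API signals the last page differently (for example with a total count or an error status), the break condition needs to change accordingly.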

Upvotes: 1

Victor Villacorta

Reputation: 617

You can try this, reusing url, payload, and headers from the question:

max_page = 100
result = {}  # maps page number -> response
for i in range(1, max_page + 1):
    payload["page"] = i
    try:
        result[i] = requests.request("POST", url, json=payload, headers=headers)
    except requests.RequestException:
        # skip pages whose request fails (connection error, timeout, ...)
        continue

Upvotes: 1

Md. Fazlul Hoque

Reputation: 16187

You can paginate by updating the "page" value in the payload inside a for loop over range():

import requests

url = "https://indeed11.p.rapidapi.com/"

payload = {
    "search_terms": "data visualization",
    "location": "New York City, NY",
    "page": 1,
    "fetch_full_text": "yes"
}

headers = {
    "content-type": "application/json",
    "X-RapidAPI-Key": "{api key here}", # insert here,
    "X-RapidAPI-Host": "indeed11.p.rapidapi.com"
}
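
results = []  # one entry per page
for page in range(1, 11):  # pages 1 to 10
    payload["page"] = page

    response = requests.post(url, json=payload, headers=headers)
    results.append(response.json())  # assumes the API returns a JSON body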
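
To end up with a single list of jobs rather than one result per page, the per-page results can be flattened after the loop. This is a minimal sketch that assumes each page's parsed JSON is a list of job records; the sample data and the combine_pages name are made up for illustration, not taken from the API.

```python
from itertools import chain


def combine_pages(pages):
    """Flatten per-page lists of records into one list, preserving order."""
    return list(chain.from_iterable(pages))


# Hypothetical per-page results, e.g. one response.json() per page
pages = [
    [{"title": "Data Analyst"}, {"title": "BI Developer"}],
    [{"title": "Visualization Engineer"}],
    [],  # a page past the end, assumed to come back empty
]

all_jobs = combine_pages(pages)
print(len(all_jobs))  # prints 3
```

If the API wraps the records in an object instead (for example under a key such as "results"), pull that list out of each page before combining.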

Upvotes: 2
