snow_fall

Reputation: 225

How to iterate through pages in an API query with no page indication?

I've made an API call using this query, which returns JSON:

import requests 
import json
import pandas as pd

url = ("https://api.meetup.com/2/groups?zip=b1+1aa&offset=0&format=json&lon=-1.89999997616&category_id=34&photo-host=public&page=500&radius=200.0&fields=&lat=52.4799995422&order=id&desc=false&sig_id=243750775&sig=ed49065d620a34c10e1f0f91dd58da2e36547af1")

data = requests.get(url).json()
df = pd.json_normalize(data['results'])

That gives me one DataFrame. However, I have 5 more pages to query, and their URLs look like this:

url2 = ("https://api.meetup.com/2/groups?zip=b1+1aa&offset=1&format=json&lon=-1.89999997616&category_id=34&photo-host=public&page=500&radius=200.0&fields=&lat=52.4799995422&order=id&desc=false&sig_id=243750775&sig=ed49065d620a34c10e1f0f91dd58da2e36547af1")

url3 is similar: the only thing that changes from page to page is the offset parameter (offset=2, and so on).

I want to know if I can use a for loop to iterate through all these pages.

Upvotes: 1

Views: 3042

Answers (2)

Daniel Roseman

Reputation: 599580

The Meetup version 2 API responds with a meta dictionary that contains a next key; you should use that to get the URL of the next page.

import requests

url = '...'
while url:
    data = requests.get(url).json()
    # ... do something with data ...
    # the loop ends once 'next' is missing or empty
    url = data['meta'].get('next')
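
To make the loop above a bit more concrete, here is a minimal sketch that collects the items from every page. It assumes each response has a 'results' list and a 'meta' dict whose 'next' value is empty or missing on the last page. The names fetch_json, iter_results, get_json and max_pages are made up for illustration; they are not part of the Meetup API or of requests.

```python
from typing import Callable, Iterator


def fetch_json(url: str) -> dict:
    """Default fetcher: GET the URL and decode the JSON body."""
    import requests  # imported here so the loop can be exercised with a stub

    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.json()


def iter_results(first_url: str,
                 get_json: Callable[[str], dict] = fetch_json,
                 max_pages: int = 100) -> Iterator[dict]:
    """Yield every item in 'results', following meta['next'] page by page."""
    url = first_url
    for _ in range(max_pages):  # hard cap guards against an endless chain
        if not url:
            break
        data = get_json(url)
        yield from data.get("results", [])
        # Assumption: 'next' is an empty string or absent on the last page.
        url = data.get("meta", {}).get("next")
```

Something like records = list(iter_results(url)) followed by pd.json_normalize(records) should then give a single DataFrame for all pages. Passing the fetcher in as an argument is only a convenience so the loop can be tried out without hitting the network.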

Upvotes: 3

bruno desthuilliers

Reputation: 77892

First, do not hardcode the query string in the URL; pass the query data to requests as a dict instead, i.e.:

import requests

url = "https://api.meetup.com/2/groups"
querydict = {
    "zip": "b1 1aa",  # literal space: requests encodes it as "+" for you
    "offset": 0,
    "format": "json",
    "lon": -1.89999997616,
    "category_id": 34,
    "photo-host": "public",
    # etc
}

response = requests.get(url, params=querydict)
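
One thing to watch when moving values out of a hardcoded URL and into a params dict: the values get percent-encoded, so a "+" that stood for a space in the original URL should be written as a real space in the dict. As far as I know, requests builds the query string with the standard library's urlencode, so this small sketch uses urlencode directly to show the difference (the variable names are only for illustration):

```python
from urllib.parse import urlencode

# A space in the value becomes "+" in the query string.
with_space = urlencode({"zip": "b1 1aa", "offset": 0})

# A literal "+" in the value is escaped as %2B, so the server
# would receive the text "b1+1aa" rather than "b1 1aa".
with_plus = urlencode({"zip": "b1+1aa", "offset": 0})

print(with_space)
print(with_plus)
```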

Then all you have to do is loop until you have all the content you want, updating querydict["offset"] on each iteration:

url = "https://api.meetup.com/2/groups"
querydict = {
    "zip": "b1 1aa",  # literal space: requests encodes it as "+" for you
    "offset": 0,
    "format": "json",
    "lon": -1.89999997616,
    "category_id": 34,
    "photo-host": "public",
    # etc
}

while True:
    response = requests.get(url, params=querydict)
    # check your response status, check the json data
    # etc
    if we_have_enough(response):  # placeholder: write your own stopping test
        break
    # ok let's fetch next page
    querydict["offset"] += 1
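
Since the question asked about a for loop, here is one possible way to fill in the stopping condition. This is only a sketch: it assumes that a page past the end comes back with an empty 'results' list, which you should verify against real responses. The names fetch_page, fetch_all, get_page and max_pages are hypothetical helpers, not part of requests or the Meetup API.

```python
def fetch_page(url: str, params: dict) -> dict:
    """Default fetcher: GET one page and decode the JSON body."""
    import requests  # imported here so the loop can be exercised with a stub

    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.json()


def fetch_all(url: str, base_params: dict,
              get_page=fetch_page, max_pages: int = 100) -> list:
    """Collect 'results' from offset 0, 1, 2, ... until a page is empty."""
    records = []
    for offset in range(max_pages):  # hard cap so the loop always ends
        # Copy the base parameters and set the offset for this page.
        params = dict(base_params, offset=offset)
        results = get_page(url, params).get("results", [])
        if not results:  # assumption: an empty page means we are done
            break
        records.extend(results)
    return records
```

With the querydict from above, records = fetch_all(url, querydict) would gather every page into one list, which pd.json_normalize(records) can turn into a single DataFrame.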

Upvotes: 4
