Get json data from multiple api pages into one main json output

Question

I'm trying to get the json data from every page on an API and put that into one big json output.

(Docs for the API I'm using: https://docs.scoresaber.com/#/Leaderboards/get_api_leaderboards)
When making the following API call:
https://scoresaber.com/api/leaderboards?qualified=true&withMetadata=true
I get a metadata object, which has total and itemsPerPage. Example:

"metadata": {
    "total": 193,
    "page": 1,
    "itemsPerPage": 14
  }

So 193/14, rounded up, means I get 14 pages.
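The page count here is a ceiling division of total by itemsPerPage. A minimal sketch using the numbers from the metadata above (the variable names are only illustrative):

```python
import math

# Values taken from the "metadata" object shown above.
total = 193
items_per_page = 14

# 193 / 14 is about 13.79, so the last page is only partly filled;
# rounding up gives the number of pages to request.
pages = math.ceil(total / items_per_page)

# Equivalent integer-only form that avoids floating point.
pages_int = -(-total // items_per_page)

print(pages, pages_int)
```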

This means I can iterate through all pages by making a request for each page with this API call: https://scoresaber.com/api/leaderboards?qualified=true&page=2 until I get to &page=14

Each page returns this JSON (trimmed example):

{
  "leaderboards": [
    {
      "id": 466447,
      "songHash": "E527C82AF2DEC46A23F12D742035D76CCA875904",
      "songName": "Parasite",
      "songSubName": "(feat. Hatsune Miku)",
      "songAuthorName": "DECO*27",
      "levelAuthorName": "Alice",
      "difficulty": {
        "leaderboardId": 466447,
        "difficulty": 1,
        "gameMode": "SoloStandard",
        "difficultyRaw": "_Easy_SoloStandard"
      },
      "maxScore": 0,
      "createdDate": "2022-06-01T17:16:52.000Z",
      "rankedDate": null,
      "qualifiedDate": "2022-06-14T05:53:21.000Z",
      "lovedDate": null,
      "ranked": false,
      "qualified": true,
      "loved": false,
      "maxPP": -1,
      "stars": 0,
      "plays": 70,
      "dailyPlays": 0,
      "positiveModifiers": false,
      "playerScore": null,
      "coverImage": "https://cdn.scoresaber.com/covers/E527C82AF2DEC46A23F12D742035D76CCA875904.png",
      "difficulties": null
    }
  ],
  "metadata": {
    "total": 193,
    "page": 2,
    "itemsPerPage": 14
  }
}

So what I want is to loop through all the pages and collect every item in leaderboards into one JSON output.

This is what I've tried:

import requests
import math
import json

response = requests.get("https://scoresaber.com/api/leaderboards?qualified=true&withMetadata=true")

api = json.loads(response.text)
pages = math.ceil(api['metadata']['total'] / api['metadata']['itemsPerPage'])
api = {}
for page in range(1, pages+1):
    api.update(json.loads(requests.get(f"https://scoresaber.com/api/leaderboards?qualified=true&page={page}").text))
api = json.dumps(api, indent=4)

But that seems to only get the last page and just overwrite the dictionary (I'm also not sure whether I need to declare api as a dict).

So I'm just not sure what is going wrong: whether I'm declaring things incorrectly, requesting the API incorrectly, or putting things into the dict incorrectly.

Andrej Kesely · Accepted Answer

If I understand you correctly, you want to collect all the data into one big list:

import json
import math
import requests

url1 = (
    "https://scoresaber.com/api/leaderboards?qualified=true&withMetadata=true"
)
url2 = "https://scoresaber.com/api/leaderboards?qualified=true&page={}"

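# First request: only used to read the metadata and work out the page count.
api = requests.get(url1).json()
pages = math.ceil(api["metadata"]["total"] / api["metadata"]["itemsPerPage"])

# Collect the "leaderboards" items of every page in one flat list.
all_data = []
for page in range(1, pages + 1):
    data = requests.get(url2.format(page)).json()
    all_data.extend(data["leaderboards"])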

print(json.dumps(all_data, indent=4))

This will print all 193 items from all pages:

[
    {
        "id": 484864,
        "songHash": "80559A7A4AC0F62F27DAF1C59DF67F305250ADFF",
        "songName": "Phony",
        "songSubName": "feat. KAFU (Hoshimachi Suisei Cover)",
        "songAuthorName": "Tsumiki",
        "levelAuthorName": "Joshabi & Shad",


...
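As for why the original attempt kept only the last page: every response has the same top-level keys (leaderboards and metadata), so dict.update replaces the previous page's values instead of merging them. A small offline sketch of the difference, using two made-up pages in place of real responses (the ids and page contents are placeholders, not data from the API):

```python
# Two fake pages shaped like the API responses; the contents are placeholders.
page1 = {"leaderboards": [{"id": 1}, {"id": 2}], "metadata": {"page": 1}}
page2 = {"leaderboards": [{"id": 3}], "metadata": {"page": 2}}

# dict.update overwrites keys that already exist, so only the last page survives.
merged = {}
for page in (page1, page2):
    merged.update(page)

# list.extend appends the items of each page to one growing list.
all_data = []
for page in (page1, page2):
    all_data.extend(page["leaderboards"])

print(len(merged["leaderboards"]))  # 1
print(len(all_data))                # 3
```

So declaring api as a dict was not the problem in itself; the pages simply need to be accumulated in a list rather than merged key by key.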
