hicham kiki

Reputation: 93

How to get more than 1000 search results with the GitHub API

I want to list the most-starred GitHub repos created in the last 30 days, but when I try to fetch more than 1000 search results, I get this error message:

{
  "message": "Only the first 1000 search results are available",
  "documentation_url": "https://developer.github.com/v3/search/"
}

Upvotes: 6

Views: 4112

Answers (3)

s.dallapalma

Reputation: 1315

zifan is right. You can run one query per day for the last 30 days, or two queries per day (one every 12 hours), and so on. The shorter the interval, the more queries you need, but the more repositories you capture.

Below is an example in Python. It makes plain HTTP GET requests, so you can easily translate it to other languages.

import requests
from datetime import datetime, timedelta

URL = 'https://api.github.com/search/repositories?q=is:public created:{}..{}'
HEADERS = {'Authorization': 'token <PASTE_HERE_GITHUB_ACCESS_TOKEN>'}

since = datetime.today() - timedelta(days=30)  # Since 30 days ago
until = since + timedelta(days=1)   # Until 29 days ago 

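while until < datetime.today():
    # Search one window and report how many repositories it matches
    day_url = URL.format(since.strftime('%Y-%m-%d'), until.strftime('%Y-%m-%d'))
    r = requests.get(day_url, headers=HEADERS)
    r.raise_for_status()  # Stop on HTTP errors such as rate limiting or a bad token
    print(f'Repositories created between {since:%Y-%m-%d} and {until:%Y-%m-%d}: {r.json().get("total_count")}')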

    # Update dates for the next search
    since = until
    until = since + timedelta(days=1)

Of course, a single window might still match more than 1000 repositories. In that case, try

  1. to use pagination to read every page of a window (pagination alone does not lift the 1000-result cap);
  2. to shorten the SINCE..UNTIL interval by using a smaller timedelta;
  3. to add further filters to the query, for example: exclude archived and forked repositories, keep only repositories with a minimum number of stars, and so forth.

Take a look here for an example. Here is a Python tool to collect repositories from GitHub: https://github.com/radon-h2020/radon-repositories-collector
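To make points 1 and 3 concrete, here is a minimal sketch of how the query string and the page requests for a single day could be built. The helper names `build_query` and `page_params` are made up for this illustration, and the page size of 100 and the `archived:false` and `stars:>=N` qualifiers are assumptions taken from the public search documentation rather than from the script above, so check them against your own setup.

```python
from datetime import date

MAX_RESULTS = 1000  # cap per search query, as stated in the error message
PER_PAGE = 100      # assumed maximum page size for the search endpoint


def build_query(day, min_stars=0):
    """Build a query for public, non-archived repos created on one day (hypothetical helper)."""
    parts = ["is:public", "archived:false", f"created:{day.isoformat()}"]
    if min_stars > 0:
        parts.append(f"stars:>={min_stars}")
    return " ".join(parts)


def page_params(query, total_count):
    """Return request parameters for every page that is reachable under the cap."""
    reachable = min(total_count, MAX_RESULTS)
    pages = -(-reachable // PER_PAGE)  # ceiling division
    return [
        {"q": query, "sort": "stars", "order": "desc",
         "per_page": PER_PAGE, "page": page}
        for page in range(1, pages + 1)
    ]
```

Each dictionary can be passed as `params=` to `requests.get('https://api.github.com/search/repositories', ...)`, which also takes care of URL-encoding the query. If `total_count` for a day is above 1000, only the first 10 pages are reachable, so that day still needs a narrower window or stricter filters.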

Upvotes: 3

zifan yan

Reputation: 125

I ran into the same issue. I think the only workaround is to split your query into smaller queries that each return fewer than 1,000 results. (See the GitHub search result limit.)
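One possible way to do the splitting automatically is to halve a time window until every piece fits under the limit. The sketch below is only an illustration: `split_windows`, `window_query` and the `count_fn` callback are hypothetical names, `count_fn(start, end)` is assumed to return the `total_count` reported by the API for that window, and the datetimes are assumed to be naive UTC values with whole seconds.

```python
from datetime import datetime, timedelta

LIMIT = 1000  # the search API returns at most this many results per query


def window_query(start, end):
    """Format a created: range qualifier with second precision (UTC assumed)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"created:{start.strftime(fmt)}..{end.strftime(fmt)}"


def split_windows(since, until, count_fn, limit=LIMIT):
    """Yield (start, end, count) windows covering since..until, each within the limit.

    count_fn is a caller-supplied callback (an assumption of this sketch) that
    returns how many repositories were created between start and end.
    """
    count = count_fn(since, until)
    # Stop when the window fits, or when it is too narrow to halve again
    if count <= limit or until - since < timedelta(seconds=2):
        yield since, until, count
        return
    mid = (since + (until - since) / 2).replace(microsecond=0)
    yield from split_windows(since, mid, count_fn, limit)
    # Start the right half one second later so the two halves do not overlap
    yield from split_windows(mid + timedelta(seconds=1), until, count_fn, limit)
```

In practice `count_fn` would send `window_query(start, end)` to the search endpoint and read `total_count` from the JSON response. Every call costs one request, so busy periods use up the search rate limit faster.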

Upvotes: 0

abRam

Reputation: 65

As per the GitHub API v3 documentation, the GitHub Search API provides up to 1,000 results for each search: https://developer.github.com/v3/search/

Upvotes: -1
