Reputation: 3
I would like to retrieve all records (total 50,000) from an API endpoint. The endpoint only returns a maximum of 1000 records per page. Here's the function to get the records.
def get_products(token,page_number):
url = "https://example.com/manager/nexus?page={}&limit={}".format(page_number,1000)
header = {
"Authorization": "Bearer {}".format(token)
}
response = requests.get(url, headers=header)
product_results = response.json()
total_list = []
for result in product_results['Data']:
date = result['date']
price = result['price']
name = result['name']
total_list.append((date,price,name))
columns = ['date', 'price', 'name']
df = pd.DataFrame(total_list, columns=columns)
results = json.dumps(total_list)
return df, results
How can I loop through each page until the final record without hardcoding the page numbers? Currently, I'm hardcoding the page numbers as below for the first 2 pages to get 2000 records as a test.
for page_number in np.arange(1,3):
token = get_token()
product_df,product_json = get_products(token,page_number)
if page_number==1:
product_all=product_df
else:
product_all=pd.concat([product_all,product_df])
print(product_all)
Thank you.
Upvotes: 0
Views: 1056
Reputation: 182
It depends on how your backend method: json GET's return. page and limit are required. you may rewrite the json return all data. in stead of just every 1000.
num = int(50000/1000);
for i in range(1, num):
token = get_token()
product_df,product_json = get_products(token, i)
if i==1:
product_all=product_df
else:
product_all=pd.concat([product_all,product_df])
print(product_all)
Upvotes: 0
Reputation: 1003
I don't know about the behavior of the endpoint. Assuming when the page number is greater than the last page number, you would get an empty list instead. If that is the case, you could just check if the result is empty.
page_number = 1
token = get_token()
product_df, product_json = get_products(token,page_number)
product_all=product_df
while product_df.size:
page_number = page_number + 1
token = get_token()
product_df,product_json = get_products(token,page_number)
product_all=pd.concat([product_all,product_df])
print(product_all)
If you are sure there are 1000 records max per page, you could check if the result count is less than 1000 and stop the loop.
Upvotes: 1