Reputation: 77
I am trying to handle an API response with pagination. The first page provides a pagination token to reach the next one, but when I try to feed this back into the params
parameter of the requests.get
method it seems to slightly encode the token in the wrong way.
My attempt to retrieve the next page (using the response
output of the first requests.get
method):
# Initial request
response = requests.get(url=url, headers=headers, params=params)
params.update({"paginationToken": response.json()["paginationToken"]})
# Next page
response = requests.get(url=url, headers=headers, params=params)
This fails with status 500: Internal Server Error and message Padding is invalid and cannot be removed.
An example pagination token:
gyuqfh%2bqyNrV9SI1%2bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%3d
The url
attribute of response
seems to show a slightly different token if you look carefully, especially around the '%' signs:
https://www.wikiart.org/en/Api/2/DictionariesByGroup?group=1&paginationToken=gyuqfh%252bqyNrV9SI1%252bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%253d
For example, the pagination token and url end differently: 226M%3d
and 226M%253d
. When I manually copy the first part of the url and add in the correct pagination token it does retrieve the information in a browser.
Am I missing some kind of encoding I should apply to the request.get
parameters before feeding them back into a new request?
Upvotes: 0
Views: 673
Reputation: 9430
You are right it is some form of encoding, percentage encoding to be precise. It is frequently used to encode URLs. It is easy to decode:
from urllib.parse import unquote
pagination_token="gyuqfh%252bqyNrV9SI1%252bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%253d"
pagination_token = unquote(pagination_token)
print(pagination_token)
Outputs:
gyuqfh%2bqyNrV9SI1%2bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%3d
But I expect that is half your problem, use a requests session object https://requests.readthedocs.io/en/master/user/advanced/#session-objects to make the requests as there is most likely a cookie which will be sent with the request to be used in conjunction with the pagination token. I can not tell for sure as the website is currently down.
Upvotes: 1