Joris

Reputation: 77

Python requests module GET method: handling pagination token in params containing %

I am trying to page through an API response. The first page returns a pagination token for reaching the next one, but when I pass that token back through the params argument of requests.get, the token appears to be encoded incorrectly.

My attempt to retrieve the next page (using the response output of the first requests.get method):

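import requests

# url, headers and params are defined earlier (not shown)

# Initial request
response = requests.get(url=url, headers=headers, params=params)

# Feed the token from the first page back into the query parameters
params.update({"paginationToken": response.json()["paginationToken"]})

# Next page
response = requests.get(url=url, headers=headers, params=params)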

This fails with status 500 (Internal Server Error) and the message "Padding is invalid and cannot be removed."

An example pagination token: gyuqfh%2bqyNrV9SI1%2bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%3d

The url attribute of the response shows a slightly different token, particularly around the '%' signs: https://www.wikiart.org/en/Api/2/DictionariesByGroup?group=1&paginationToken=gyuqfh%252bqyNrV9SI1%252bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%253d

For example, the pagination token ends in 226M%3d, while the url ends in 226M%253d. When I manually copy the first part of the url and append the correct pagination token, a browser does retrieve the information.

Am I missing some kind of encoding I should apply to the requests.get parameters before feeding them back into a new request?

Upvotes: 0

Views: 673

Answers (1)

Dan-Dev

Reputation: 9430

You are right, it is a form of encoding: percent-encoding, to be precise. It is commonly used in URLs. The token you receive is already percent-encoded, and requests encodes it again, so each % becomes %25. It is easy to decode:

from urllib.parse import unquote

pagination_token = "gyuqfh%252bqyNrV9SI1%252bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%253d"
pagination_token = unquote(pagination_token)
print(pagination_token)

Outputs:

gyuqfh%2bqyNrV9SI1%2bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%3d
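
The same idea can be applied one step earlier, to the token as the API returns it, so that the second layer of encoding never appears. The sketch below uses only the standard library and assumes that requests escapes params values roughly the way urllib.parse.urlencode does (the double-encoded URL in the question is consistent with that). The group=1 parameter is taken from the URL in the question.

```python
from urllib.parse import unquote, urlencode

# Token exactly as quoted in the question (already percent-encoded).
TOKEN = (
    "gyuqfh%2bqyNrV9SI1%2bXulE6MXxJgb1VmOu68eH4YZ6dWUgRItb7yJPnO9bcEXdwg6"
    "gnYStBuiFhuMxILSB2gpZCLb2UjRE0pp9RkDdIP226M%3d"
)

# Passing the token as-is: every "%" is escaped again as "%25".
double_encoded = urlencode({"group": 1, "paginationToken": TOKEN})

# Decoding once first: the encoder then applies a single layer of escaping.
single_encoded = urlencode({"group": 1, "paginationToken": unquote(TOKEN)})

print(double_encoded)  # contains %252b and %253d, as in the failing URL
print(single_encoded)  # contains %2B and %3D (hex case differs, same meaning)
```

In the question's code this would amount to wrapping the token in unquote() before calling params.update, for example params.update({"paginationToken": unquote(response.json()["paginationToken"])}).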

But I expect that is only half your problem. Use a requests Session object (https://requests.readthedocs.io/en/master/user/advanced/#session-objects) to make the requests, as there is most likely a cookie that must be sent along with the pagination token. I cannot tell for sure, as the website is currently down.
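
A rough sketch of how the two suggestions might be combined is shown here. The function name fetch_all_pages, the max_pages cap, and the rule "no token means last page" are all assumptions made for illustration, not documented behaviour of the API. The session argument is meant to be a requests.Session, so that any cookies set by the first response are sent with later ones.

```python
from urllib.parse import unquote


def fetch_all_pages(session, url, params, max_pages=100):
    """Collect the JSON body of each page (hypothetical helper).

    `session` is expected to be a requests.Session; anything with a
    compatible .get() works. The "paginationToken" key comes from the
    question; the stop condition and max_pages cap are assumptions.
    """
    params = dict(params)  # do not mutate the caller's dict
    pages = []
    for _ in range(max_pages):
        response = session.get(url, params=params)
        response.raise_for_status()
        body = response.json()
        pages.append(body)
        token = body.get("paginationToken")
        if not token:
            break  # assumed: no token means this was the last page
        # Decode once so the library's own encoding restores the token.
        params["paginationToken"] = unquote(token)
    return pages
```

It would be called with something like fetch_all_pages(requests.Session(), url, params), after setting any required headers on the session via session.headers.update(headers).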

Upvotes: 1
