Reputation: 137
When I open a link in a browser, it redirects to another URL, but when I request the same link from a script, the URL stays the same. I'm looking for a way to get the redirected URL from the original one. My main goal is to download a PDF file, which only shows up once the URL has been redirected.
To be specific, I want to use this link https://shapeyourcity.ca/32362/widgets/132002/documents/87994
within the script below to capture the redirected link so that I can later use the redirected link to download the pdf file.
I've tried with:
import requests

link = 'https://shapeyourcity.ca/32362/widgets/132002/documents/87994'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
}

with requests.Session() as s:
    s.headers.update(headers)
    res = s.get(link, allow_redirects=True)
    print(res.url)
When I run the script, I end up getting the same link. However, it gets redirected in the browser.
Question: How can I grab a redirected link using the requests module?
Upvotes: 0
Views: 32
Reputation: 195438
Add /download to the URL:
import requests

url = "https://shapeyourcity.ca/32362/widgets/132002/documents/87994"

with open("file.pdf", "wb") as f_out:
    resp = requests.get(url + "/download")
    print("Downloading from", resp.url)
    f_out.write(resp.content)
Prints:
Downloading from https://ehq-production-canada.s3.ca-central-1.amazonaws.com/e1c3b8b12b4a1e1604726ae8994703200c063dad/original/1661977908/12da2a93dcb772a56178c9b34d4bc429_Notification_postcard.pdf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIBJCUKKD4ZO4WUUA%2F20220903%2Fca-central-1%2Fs3%2Faws4_request&X-Amz-Date=20220903T130651Z&X-Amz-Expires=300&X-Amz-SignedHeaders=host&X-Amz-Signature=f81b11379be56a2201bc70ac8c4dbb8f541d4a1247a233f81be74f4d27b1b2e5
and saves file.pdf
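If you also want to see every hop rather than only the final URL, requests keeps the intermediate redirect responses in `response.history`. Below is a minimal sketch of that idea, not a definitive implementation: `redirect_chain` and `download_pdf` are hypothetical helper names, the 30-second timeout is an arbitrary choice, and the code assumes the site keeps serving the file under the `/download` suffix as it does above.

```python
from pathlib import Path


def redirect_chain(response):
    """Return every URL visited, in order, ending with the final URL."""
    # response.history holds the intermediate redirect responses (oldest first);
    # response.url is the URL of the final response.
    return [r.url for r in response.history] + [response.url]


def download_pdf(session, document_url, out_path):
    """Fetch document_url + '/download' and save the body to out_path."""
    # Assumption: the file is served under the "/download" suffix,
    # as observed for this particular site.
    response = session.get(document_url.rstrip("/") + "/download", timeout=30)
    response.raise_for_status()  # fail loudly instead of saving an error page
    Path(out_path).write_bytes(response.content)
    return redirect_chain(response)
```

To use it, pass in a `requests.Session()` (after `import requests`), for example `download_pdf(s, url, "file.pdf")` inside a `with requests.Session() as s:` block, and print the returned list to see each redirect. Writing the file only after `raise_for_status()` avoids leaving an empty or bogus `file.pdf` behind when the request fails.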
Upvotes: 2