Krupesh Pandya

Reputation: 1

Issue downloading a PDF using the requests module in Python

import requests

pdf_url = "https://www.alexandrina.sa.gov.au/__data/assets/pdf_file/0028/1619614/Council-Special-Meeting-Agenda-11-June-2024.pdf"
pdf_path = 'Test.pdf'
response = requests.get(pdf_url)
pdf_content = response.content 

with open(pdf_path, 'wb') as pdf_file:
    pdf_file.write(pdf_content)

With this code I am not able to download the PDF because the request gets a 403 response. When I open the URL manually in Chrome, the PDF opens and I can also download it locally. With the requests module, however, I cannot download it. If I use a proxy or a scraping service, the file does download, but it is corrupted, so I cannot open it. Can you please help? What should I do?

Upvotes: 0

Views: 30

Answers (1)

wenbo

Reputation: 1506

There seems to be no issue in your code. I just changed it to use another PDF URL, and it works well.

import os
import requests

save_dir = os.getcwd()
file_name = 'test.pdf'

#url = 'https://www.alexandrina.sa.gov.au/__data/assets/pdf_file/0028/1619614/Council-Special-Meeting-Agenda-11-June-2024.pdf'

url2 = 'https://bitcoin.org/bitcoin.pdf'


outfile = os.path.join(save_dir, file_name)
response = requests.get(url2)
response.raise_for_status()  # fail loudly on 403 and other error statuses
with open(outfile, 'wb') as output:
    output.write(response.content)

As someone mentioned here, the server hosting the PDF blocks downloads made from code in order to keep bots out.

Instead of the PDF, the web server is providing you with a web page intended to prevent bots from downloading data from the site.
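
That would also explain the "corrupted" file: what gets saved is probably the block page, not a PDF. Below is a minimal sketch of how you could check the response before writing it to disk. The helper names `pdf_problem` and `download_pdf` are hypothetical, and the browser-like `User-Agent` header is only an assumption; it may well not be enough to get past this site's bot protection.

```python
def pdf_problem(status_code, content_type, body):
    """Return a short description of what is wrong, or None if the body looks like a PDF."""
    if status_code != 200:
        return f"server returned HTTP {status_code}"
    # Well-formed PDF files normally start with the bytes "%PDF-"
    if not body.startswith(b"%PDF-"):
        return f"body is not a PDF (Content-Type: {content_type!r})"
    return None


def download_pdf(url, path):
    """Download url to path, refusing to save anything that is not a PDF."""
    import requests  # third-party; imported here so pdf_problem() can be used without it

    # Assumption: a browser-like User-Agent. This may not satisfy the site's bot check.
    headers = {"User-Agent": "Mozilla/5.0"}
    response = requests.get(url, headers=headers, timeout=30)

    problem = pdf_problem(
        response.status_code,
        response.headers.get("Content-Type", ""),
        response.content,
    )
    if problem:
        raise RuntimeError(f"Not saving {path}: {problem}")

    with open(path, "wb") as pdf_file:
        pdf_file.write(response.content)
```

Calling `download_pdf(pdf_url, 'Test.pdf')` then either saves a real PDF or raises an error that says whether you got a 403 or an HTML page, instead of silently writing an unreadable file.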

Upvotes: 0
