selvask

Reputation: 59

How to download a file from a web URL in Python? Download works in the browser but not with Python's requests

The file downloads fine if I enter the URL in a browser (Firefox, Chrome, etc.). But when I try to download the same file (using the same URL) with Python's requests or urllib library, I don't get any response.

URL: https://www.nseindia.com/products/content/sec_bhavdata_full.csv (Reference Page: https://www.nseindia.com/products/content/equities/equities/eq_security.htm)

What I tried:

import requests
eqfile = requests.get('https://www.nseindia.com/products/content/sec_bhavdata_full.csv')

Got no response. Then I tried the following:

temp = requests.get('https://www.nseindia.com/products/content/equities/equities/eq_security.htm')

Again, no response.

What is the best way to download a file from a URL like this (on this kind of web server)?

Upvotes: 2

Views: 2530

Answers (1)

furas

Reputation: 143231

If I send a User-Agent header similar to the one a real web browser sends, then I can download the file.

import requests

headers = {'User-Agent': 'Mozilla/5.0'}
url = 'https://www.nseindia.com/products/content/sec_bhavdata_full.csv'

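# timeout stops the call from hanging forever if the server never answers
r = requests.get(url, headers=headers, timeout=30)
r.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
#print(r.content)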

with open('sec_bhavdata_full.csv', 'wb') as fh:
    fh.write(r.content)

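Portals often check this header to block requests or to format the HTML for your browser or device. By default, requests sends "python-requests/<version>" and urllib.request sends "Python-urllib/<version>" in this header, so the server can easily tell the request comes from a script.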
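To see this for yourself, here is a small sketch using only the standard library. It prints the default User-Agent that urllib.request would send, then builds (without sending) a request that overrides it. Nothing here touches the network; the URL is the one from the question.

```python
import urllib.request

# Every opener starts with one default header: the library's own User-Agent.
default_headers = dict(urllib.request.build_opener().addheaders)
default_ua = default_headers["User-agent"]
print(default_ua)  # something like Python-urllib/3.x

# Build (but do not send) a request that overrides the default value.
url = "https://www.nseindia.com/products/content/sec_bhavdata_full.csv"
req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})

# Request stores header names in capitalized form, hence "User-agent".
print(req.get_header("User-agent"))
```

A header set on the Request takes precedence over the opener's default, so the server would see the browser-like value.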

Many portals need only 'User-Agent': 'Mozilla/5.0' to send content, but others may need a full User-Agent string or even other headers such as Referer, Accept, Accept-Encoding, Accept-Language. You can see the headers your browser sends by opening https://httpbin.org/get in a real browser.
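If 'Mozilla/5.0' alone is not enough, a fuller set of headers might look like the sketch below. It uses urllib.request from the standard library. The function names build_request and download are made up for this example, and the header values are only illustrative; copy the real ones from your own browser. I can't promise any particular site accepts them.

```python
import shutil
import urllib.request


def build_request(url, referer=None):
    """Return a Request carrying browser-like headers (illustrative values)."""
    headers = {
        # Example value only; copy the real string from your own browser.
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) "
                      "Gecko/20100101 Firefox/115.0",
        "Accept": "text/csv,text/html;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.5",
        # Accept-Encoding is left out on purpose: urllib does not
        # decompress gzip responses automatically.
    }
    if referer:
        # The HTTP header name really is spelled "Referer".
        headers["Referer"] = referer
    return urllib.request.Request(url, headers=headers)


def download(url, dest, referer=None, timeout=30):
    """Stream the response body into dest; urlopen raises on HTTP errors."""
    req = build_request(url, referer)
    with urllib.request.urlopen(req, timeout=timeout) as resp, \
            open(dest, "wb") as fh:
        shutil.copyfileobj(resp, fh)
```

You would call it like download(url, 'sec_bhavdata_full.csv', referer='https://www.nseindia.com/products/content/equities/equities/eq_security.htm'), using the reference page from the question as the Referer.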

Upvotes: 6
