Reputation: 59
The file downloads fine when I enter the URL in a browser (Firefox, Chrome, etc.). But when I try to download the same file (using the same URL) with Python's requests or urllib library, I don't get any response.
URL: https://www.nseindia.com/products/content/sec_bhavdata_full.csv (Reference Page: https://www.nseindia.com/products/content/equities/equities/eq_security.htm)
What I tried:
import requests
eqfile = requests.get('https://www.nseindia.com/products/content/sec_bhavdata_full.csv')
I got no response. Then I tried the following:
temp = requests.get('https://www.nseindia.com/products/content/equities/equities/eq_security.htm')
Again, no response.
What would be the optimal way to download a file from such a URL (web server)?
Upvotes: 2
Views: 2530
Reputation: 143231
If I send a User-Agent header similar to the one a real web browser sends, then I can download it.
import requests
# Send a browser-like User-Agent instead of the library default
headers = {'User-Agent': 'Mozilla/5.0'}
url = 'https://www.nseindia.com/products/content/sec_bhavdata_full.csv'
r = requests.get(url, headers=headers)
#print(r.content)
# Write the raw response bytes to a local file
with open('sec_bhavdata_full.csv', 'wb') as fh:
    fh.write(r.content)
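Portals often check this header to block requests or to format HTML specially for your browser/device. But by default requests sends "python-requests/<version>" and urllib.request sends "Python-urllib/<version>" in this header, which makes the script easy to recognize.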
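To see what a script announces by default, you can inspect the library's default headers without making any network request. This is a minimal sketch using only the standard library; the variable names are mine. For requests, `requests.utils.default_user_agent()` should give the corresponding value.

```python
import urllib.request

# build_opener() returns an OpenerDirector; its addheaders list holds
# the headers urllib attaches to every request unless you override them.
opener = urllib.request.build_opener()
default_headers = dict(opener.addheaders)

# The key is stored as 'User-agent' (capitalized by urllib).
default_ua = default_headers["User-agent"]
print(default_ua)  # something like Python-urllib/3.x
```

A server that filters on this header only has to look for the "Python-urllib" or "python-requests" prefix to turn the request away.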
Many portals need only 'User-Agent': 'Mozilla/5.0' to send content, but others may need the full User-Agent string or even other headers like Referer, Accept, Accept-Encoding, Accept-Language. You can see the headers your browser sends by opening https://httpbin.org/get in a real browser.
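If 'Mozilla/5.0' alone is not enough, a fuller header set could look like the sketch below. It uses urllib.request from the standard library rather than requests. The header values are illustrative assumptions, not what this server is known to require, so copy the real ones from your own browser. `build_request` and `download` are hypothetical helper names, and Accept-Encoding is left out on purpose because urllib does not decompress gzip responses for you.

```python
import urllib.request

# Illustrative values only (assumptions); replace them with the headers
# your own browser shows at https://httpbin.org/get
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
    "Referer": "https://www.nseindia.com/products/content/equities/equities/eq_security.htm",
}


def build_request(url, headers=BROWSER_HEADERS):
    """Return a Request object carrying browser-like headers (no network I/O)."""
    return urllib.request.Request(url, headers=headers)


def download(url, path, timeout=30):
    """Fetch url with the headers above and save the raw bytes to path."""
    req = build_request(url)
    with urllib.request.urlopen(req, timeout=timeout) as resp, open(path, "wb") as fh:
        fh.write(resp.read())


# Building the request does not contact the server; it only shows what would be sent.
req = build_request("https://www.nseindia.com/products/content/sec_bhavdata_full.csv")
print(req.header_items())
```

Calling `download(url, 'sec_bhavdata_full.csv')` would then perform the actual transfer. The timeout makes a silently dropped connection raise an error instead of hanging, which may help tell "blocked" apart from "no response".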
Upvotes: 6