Reputation: 164
I'm trying to download a CSV file from a webpage using the urllib package. To download the file from that site, it is necessary to send a POST request with the appropriate parameters.
When I use the requests module, I can download the file flawlessly. However, when I try the same thing with urllib, I also get a CSV file, but it contains only the headers. The body is missing.
Here is how to download that file manually from that site:
Site address: https://www.nyiso.com/custom-reports?report=dam_lbmp_zonal
Zones: CAPITL, CENTRL
Version: Latest
Format: CSV
Hit `Generate Report` button
The following script downloads only the headers of the CSV file:
import csv
import urllib.request
import urllib.parse

link = "http://dss.nyiso.com/dss_oasis/PublicReports"
params = {
    'reportKey': 'DAM_LBMP_ZONE',
    'startDate': '04/17/2021',
    'endDate': '04/17/2021',
    'version': 'L',
    'dataFormat': 'CSV',
    'filter': ['CAPITL','CENTRL'],
}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
}

data = urllib.parse.urlencode(params).encode()
req = urllib.request.Request(link, data=data, headers=headers)
res = urllib.request.urlopen(req)
with open("output.csv","wb") as f:
    f.write(res.read())
How can I download the complete CSV file from this website using the urllib package?
Upvotes: 0
Views: 428
Reputation: 410
One small change fixes this: because the `filter` parameter is a list, you need to pass `doseq=True` to `urlencode` so that each element of the list is encoded as its own `filter=...` pair. Without it, the whole list is converted to a string and sent as a single value.
See the code below for reference.
import urllib.request
import urllib.parse

link = "http://dss.nyiso.com/dss_oasis/PublicReports"
params = {
    'reportKey': 'DAM_LBMP_ZONE',
    'startDate': '04/17/2021',
    'endDate': '04/17/2021',
    'version': 'L',
    'dataFormat': 'CSV',
    'filter': ['CAPITL','CENTRL'],
}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36'
}

# doseq=True encodes each item of the 'filter' list as a separate key=value pair
data = urllib.parse.urlencode(params,doseq=True).encode()
# Supplying data makes urllib send the request as a POST
req = urllib.request.Request(link, data=data, headers=headers)
res = urllib.request.urlopen(req)
with open("output.csv","wb") as f:
    f.write(res.read())
Only a small modification to the `urlencode` line was needed.
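To see what `doseq` changes, here is a minimal sketch that compares the two encodings. It uses a trimmed-down `params` dict purely for illustration (the names `single` and `repeated` are mine, not part of your script), and it makes no network request. Presumably the server does not recognize the zones when they arrive as one stringified list, which would explain the header-only file, but that part is a guess on my side.

```python
from urllib.parse import urlencode

# Trimmed-down parameters, for illustration only
params = {
    'dataFormat': 'CSV',
    'filter': ['CAPITL', 'CENTRL'],
}

# Default (doseq=False): the list is passed through str() and quoted as ONE value
single = urlencode(params)

# doseq=True: every list element becomes its own filter=... pair
repeated = urlencode(params, doseq=True)

print(single)    # dataFormat=CSV&filter=%5B%27CAPITL%27%2C+%27CENTRL%27%5D
print(repeated)  # dataFormat=CSV&filter=CAPITL&filter=CENTRL
```

The second form, with the key repeated once per value, is the usual way a form field with several values is sent in a POST body.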
Let me know if you have any questions :)
Upvotes: 5