Reputation: 33
I want to download a CSV file which is located under the Export button on this page: https://data.cityofnewyork.us/Public-Safety/NYPD-Motor-Vehicle-Collisions/h9gi-nx95
I tried using Beautiful Soup after examining the page source for the segment containing the Export button. However, the code below returns an empty list.
import requests
from bs4 import BeautifulSoup

url = 'https://data.cityofnewyork.us/Public-Safety/NYPD-Motor-Vehicle-Collisions/h9gi-nx95'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')
# Attempt to locate the Export download link; this prints []
domain_csv = soup.find_all('class', 'download-link')
print(domain_csv)
Running this returns an empty list, meaning the download link is not found in the soup.
Does anyone have any thoughts as to how to grab a CSV which requires clicking on a link such as the one provided above?
Upvotes: 0
Views: 3508
Reputation: 195408
BeautifulSoup cannot "click" a link on a web page. You need to observe which requests the browser makes when you click that link (e.g. in the Firefox developer tools). This page uses the following URL to download the CSV (warning: huge file!):
import requests
url = 'https://data.cityofnewyork.us/api/views/h9gi-nx95/rows.csv?accessType=DOWNLOAD'
print(requests.get(url).text)
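Since the file is huge, printing the whole response body may not be practical. A minimal sketch of saving it to disk in chunks instead, assuming the endpoint above still serves the CSV; the function name `download_csv`, the output filename, the `session` parameter, and the chunk size are illustrative choices, not part of the original answer:

```python
CSV_URL = 'https://data.cityofnewyork.us/api/views/h9gi-nx95/rows.csv?accessType=DOWNLOAD'


def download_csv(url, dest_path, session=None, chunk_size=1024 * 1024, timeout=60):
    """Stream the response body of `url` into `dest_path` and return the path.

    `session` is a hypothetical hook: anything with a requests-style
    `get(url, stream=..., timeout=...)` method can be passed in.
    """
    if session is None:
        import requests  # imported lazily so a custom session can be injected
        session = requests.Session()
    # stream=True avoids loading the whole body into memory at once
    with session.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()  # fail loudly on HTTP errors
        with open(dest_path, 'wb') as out_file:
            for chunk in response.iter_content(chunk_size=chunk_size):
                out_file.write(chunk)
    return dest_path


# Example call (performs the real, large download):
# download_csv(CSV_URL, 'collisions.csv')
```

With `stream=True`, the body is fetched only as `iter_content` is consumed, so memory use stays around one chunk regardless of the file size.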
Upvotes: 2