GRoberts

Reputation: 1

BeautifulSoup returns empty results for all table tags

I'm trying to access the table details so that I can ultimately load them into a dataframe and save a limited number of rows as a CSV (the dataset is massive). The data comes from the following site: https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2/data

I'm just starting out with web scraping and was practicing on this dataset. I can pull tags like div without trouble, but when I try soup.findAll('tr'), or the same call for td, it returns an empty set.

The table appears to be embedded in separate code (see link above), so maybe that's my issue, but I'm still unsure how to access the detail rows, headers, etc. Selenium, maybe?

Thanks in advance!

Upvotes: 0

Views: 253

Answers (1)

felipe

Reputation: 8025

By the looks of it, the website already allows you to export the data through its Download option.

As it would seem, the original link is:

https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2/data

The .csv download link is:

https://data.cityofchicago.org/api/views/ijzp-q8t2/rows.csv?accessType=DOWNLOAD

The .json link is:

https://data.cityofchicago.org/resource/ijzp-q8t2.json

Therefore, you could simply extract the ID of the dataset, in this case ijzp-q8t2, and substitute it into the download links above. Here is the official documentation of their API.
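The ID substitution can be sketched roughly as follows. The helper name `socrata_links` is hypothetical, and the regular expression assumes dataset IDs always have the four-by-four shape seen in ijzp-q8t2. The `$limit` query parameter belongs to the SODA API and caps the number of rows the .json endpoint returns.

```python
import re
from urllib.parse import urlsplit, urlencode


def socrata_links(page_url, limit=None):
    """Build the .csv and .json links for a dataset page URL (hypothetical helper)."""
    parts = urlsplit(page_url)
    # Assumption: the dataset ID is a path segment shaped like 'ijzp-q8t2'.
    match = re.search(r"/([a-z0-9]{4}-[a-z0-9]{4})(?:/|$)", parts.path)
    if match is None:
        raise ValueError(f"no dataset ID found in {page_url!r}")
    dataset_id = match.group(1)
    base = f"{parts.scheme}://{parts.netloc}"
    json_url = f"{base}/resource/{dataset_id}.json"
    if limit is not None:
        # Keep '$' unescaped so the parameter reads '$limit=2000'.
        json_url += "?" + urlencode({"$limit": limit}, safe="$")
    return {
        "id": dataset_id,
        "csv": f"{base}/api/views/{dataset_id}/rows.csv?accessType=DOWNLOAD",
        "json": json_url,
    }


links = socrata_links(
    "https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-present/ijzp-q8t2/data",
    limit=2000,
)
print(links["json"])
```

If you only need a capped sample, the resulting .json link should be usable directly with pandas.read_json, which accepts URLs.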

import pandas as pd
from sodapy import Socrata

# Unauthenticated client only works with public data sets. Note 'None'
# in place of application token, and no username or password:
client = Socrata("data.cityofchicago.org", None)

# Example authenticated client (needed for non-public datasets):
# client = Socrata("data.cityofchicago.org",
#                  MyAppToken,
#                  username="[email protected]",
#                  password="AFakePassword")

# First 2000 results, returned as JSON from API / converted to Python list of
# dictionaries by sodapy.
results = client.get("ijzp-q8t2", limit=2000)

# Convert to pandas DataFrame
results_df = pd.DataFrame.from_records(results)
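To finish the task from the question (a CSV with a limited number of rows), the DataFrame can be written out with to_csv. Here is a minimal sketch that runs without network access: `sample_results` is a made-up stand-in for the list of dictionaries returned by client.get(), and both the helper name `save_records` and the file name are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the list of dictionaries returned by client.get();
# the field names and values are made up for illustration.
sample_results = [
    {"id": "1", "primary_type": "THEFT", "arrest": False},
    {"id": "2", "primary_type": "BATTERY"},  # a record may lack some fields
]


def save_records(records, path):
    """Convert a list of dicts to a DataFrame and write it to a CSV file."""
    df = pd.DataFrame.from_records(records)
    df.to_csv(path, index=False)  # index=False leaves out the row-number column
    return df


df = save_records(sample_results, "crimes_sample.csv")
print(df.shape)
```

With the real data you would pass results (or call results_df.to_csv directly), and the limit argument of client.get() controls how many rows end up in the file.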

Upvotes: 2
