Reputation: 301
I'm trying to scrape the data table from this site: https://fl511.com/list/events/traffic?start=0&length=25&filters%5B0%5D%5Bi%5D=5&filters%5B0%5D%5Bs%5D=Incidents&order%5Bi%5D=8&order%5Bdir%5D=asc
Unfortunately, when I print the table, the output does not include the tbody tag, which is where the information is stored. All the other tags are shown. Is there a workaround for this?
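from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

url = "https://fl511.com/list/events/traffic?start=0&length=25&filters%5B0%5D%5Bi%5D=5&filters%5B0%5D%5Bs%5D=Incidents&order%5Bi%5D=8&order%5Bdir%5D=asc"
# Send a browser-like User-Agent with the request
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()
soup = BeautifulSoup(webpage, 'html.parser')
table = soup.find_all('table')
print(table)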
Upvotes: 0
Views: 146
Reputation: 195543
The data is loaded from an external source via JavaScript, which is why the rows are missing from the HTML you downloaded. You can request the data directly from that source, as in this example:
import json
import requests
data = {
"draw": 1,
"columns": [
{
"data": None,
"name": "",
"searchable": False,
"orderable": False,
"search": {"value": "", "regex": False},
"title": "",
"visible": True,
"isUtcDate": False,
"isCollection": False,
},
{
"data": "region",
"name": "region",
"searchable": False,
"orderable": True,
"search": {"value": "", "regex": False},
"isUtcDate": False,
"isCollection": False,
},
{
"data": "county",
"name": "county",
"searchable": False,
"orderable": True,
"search": {"value": "", "regex": False},
"isUtcDate": False,
"isCollection": False,
},
{
"data": "roadwayName",
"name": "roadwayName",
"searchable": False,
"orderable": True,
"search": {"value": "", "regex": False},
"isUtcDate": False,
"isCollection": False,
},
{
"data": "direction",
"name": "direction",
"searchable": False,
"orderable": True,
"search": {"value": "", "regex": False},
"isUtcDate": False,
"isCollection": False,
},
{
"data": "type",
"name": "type",
"searchable": False,
"orderable": True,
"search": {"value": "Incidents", "regex": False},
"isUtcDate": False,
"isCollection": False,
},
{
"data": "severity",
"name": "severity",
"searchable": False,
"orderable": True,
"search": {"value": "", "regex": False},
"isUtcDate": False,
"isCollection": False,
},
{
"data": "description",
"name": "description",
"searchable": False,
"orderable": False,
"search": {"value": "", "regex": False},
"isUtcDate": False,
"isCollection": False,
},
{
"data": "startTime",
"name": "startTime",
"searchable": False,
"orderable": True,
"search": {"value": "", "regex": False},
"isUtcDate": False,
"isCollection": False,
},
{
"data": "lastUpdated",
"name": "lastUpdated",
"searchable": False,
"orderable": True,
"search": {"value": "", "regex": False},
"isUtcDate": False,
"isCollection": False,
},
{
"data": 10,
"name": "",
"searchable": False,
"orderable": False,
"search": {"value": "", "regex": False},
"isUtcDate": False,
"isCollection": False,
},
],
"order": [{"column": 8, "dir": "asc"}],
"start": 0,
"length": 25,
"search": {"value": "", "regex": False},
}
url = "https://fl511.com/List/GetData/traffic"
data = requests.post(url, json=data).json()
# uncomment this to print all data:
# print(json.dumps(data, indent=4))
for i, d in enumerate(data["data"], 1):
    print(i, d["description"])
print()
print("Records total:", data["recordsTotal"])
print("Records filtered:", data["recordsFiltered"])
Prints:
1 Crash in Highlands County on US-27 South, at Lake Josephine Dr. Right lane blocked. Last updated at 04:24 PM.
2 Emergency vehicles in Highlands County on US-27 North, at Lake Josephine Dr. Right lane blocked. Last updated at 04:25 PM.
3 Crash in Manatee County on US-41 North, at Pearl Ave. All lanes blocked. Last updated at 04:29 PM.
4 Crash in Polk County on I-4 East, beyond CR-557. 2 Left lanes blocked. Last updated at 04:32 PM.
5 Emergency vehicles in Manatee County on US-41 South, at Pearl Ave. Left lane blocked. Last updated at 04:35 PM.
6 Crash in Miami-Dade County on I-195 East, beyond North Miami Ave. Right lane blocked. Last updated at 05:03 PM.
7 Crash in Santa Rosa County on I-10 East, ramp to Exit 22 (SR-281/Avalon Blvd). Right shoulder blocked. Last updated at 05:05 PM.
8 Emergency vehicles in Santa Rosa County on I-10 West, at Exit 22 (SR-281/Avalon Blvd). Left shoulder blocked. Last updated at 05:02 PM.
9 Multi-vehicle crash in Duval County on I-295 E South, before Between Atlantic Blvd/St Johns Bluff Rd. Left shoulder blocked. Last updated at 05:30 PM.
Records total: 93
Records filtered: 9
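If there are more matching records than fit in one request, the "start" and "length" fields in the payload could be used to page through them. The following is only a sketch: it assumes the endpoint treats "start" as a row offset and reports the number of matching rows in "recordsFiltered", as the output above suggests. The names iter_all_rows, fetch_page and post_page are hypothetical helpers, not part of the site or of requests.

```python
import copy

URL = "https://fl511.com/List/GetData/traffic"


def iter_all_rows(fetch_page, payload, page_size=25):
    """Yield every row, advancing "start" until all filtered rows are seen.

    fetch_page is any callable that takes a payload dict and returns the
    decoded JSON reply (a dict with "data" and "recordsFiltered").
    Assumption: the server honors "start" and "length" as offset and size.
    """
    payload = copy.deepcopy(payload)  # leave the caller's dict untouched
    payload["length"] = page_size
    start = 0
    while True:
        payload["start"] = start
        reply = fetch_page(payload)
        rows = reply["data"]
        if not rows:
            break  # nothing more to read
        yield from rows
        start += len(rows)
        if start >= reply["recordsFiltered"]:
            break  # every matching row has been yielded


def post_page(payload):
    """Hypothetical fetcher: POST the payload as JSON, as in the answer."""
    import requests  # third-party; imported here so the helper above has no dependencies

    return requests.post(URL, json=payload).json()
```

To use it, pass the request dictionary before it is overwritten by the response, for example: for row in iter_all_rows(post_page, payload): print(row["description"]). Keeping the fetcher as a parameter makes it easy to try the loop against canned replies without hitting the site.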
Upvotes: 2