user2353003

Reputation: 552

Read in game stats table from MLB website into Beautiful Soup

I'm trying to scrape the Game Stats table from an MLB player page (https://www.mlb.com/player/charlie-morton-450203?stats=gamelogs-r-pitching-mlb&year=2019). I can see the table's class name when I inspect the HTML in Chrome, but Beautiful Soup does not find it.

Is there some workaround/trick to getting this in correctly?

from bs4 import BeautifulSoup
import requests

page = requests.get('https://www.mlb.com/player/charlie-morton-450203?stats=gamelogs-r-pitching-mlb&year=2019')

soup = BeautifulSoup(page.text, "html.parser")
body = soup.find('body')

table = body.find_all('div', class_='gamelogs-table')
print(table)

Upvotes: 0

Views: 569

Answers (2)

GZ0

Reputation: 4273

If you just want to retrieve the data, I would suggest you look for existing APIs like this before attempting to scrape the website. Scrapers are susceptible to website layout changes.

This is a Reddit forum that you may be interested in.

Upvotes: 0

Andrej Kesely

Reputation: 195543

The data is loaded through AJAX, so the table is not in the HTML that requests downloads. To get the data, you need to find the URL the page fetches it from, e.g. in the Network tab of the Firefox developer tools. This script prints the JSON data of player 450203:

import requests
import json

url = 'https://statsapi.mlb.com/api/v1/people/450203/stats?stats=gameLog'
data = requests.get(url).json()

print(json.dumps(data, indent=4))
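Once you have the JSON, you will probably want one row per game rather than a raw dump. The sketch below assumes the response nests the games under data["stats"][i]["splits"], with each split holding a "date" and a "stat" dict; I have not verified that shape here, so check it against the printed output first. The flatten_game_log helper and the sample values are made up for illustration and are not real statistics.

```python
def flatten_game_log(data):
    """Return one flat dict per game from a gameLog-style response.

    Assumes (unverified) the shape data["stats"][i]["splits"][j],
    where each split has a "date" and a "stat" dict.
    """
    rows = []
    for group in data.get("stats", []):
        for split in group.get("splits", []):
            row = {"date": split.get("date")}
            # Merge the per-game numbers into the same flat dict.
            row.update(split.get("stat", {}))
            rows.append(row)
    return rows


# Hand-written sample standing in for the real response (made-up values).
sample = {
    "stats": [
        {
            "splits": [
                {"date": "2019-03-29", "stat": {"inningsPitched": "5.0", "strikeOuts": 8}},
                {"date": "2019-04-04", "stat": {"inningsPitched": "6.0", "strikeOuts": 6}},
            ]
        }
    ]
}

rows = flatten_game_log(sample)
for row in rows:
    print(row)
```

With the real response you would call flatten_game_log(data) instead of passing the sample. The list of dicts it returns can be handed to csv.DictWriter or pandas.DataFrame if you want an actual table.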

Upvotes: 2
