DG.Finance

Reputation: 85

Python: How do I scrape ESPN for game matchups

Fairly new to programming, apologies if the question is broad.

import requests
from bs4 import BeautifulSoup

def data():
    League = ['nba', 'nfl', 'mlb']
    url = f"http://www.espn.com/{League[0]}/schedule"
    response = requests.get(url)
    soup = BeautifulSoup(response.content, "html.parser")
    date = soup.find('h2',{'class':'table-caption'})
    return date.string

This is what I have so far, but what I'd like to be able to do is scrape the matchups for this day so it comes out as:

>>> 'Los Angeles Lakers at Charlotte Bobcats 7:00PM'
>>> 'Boston Celtics at Detroit Pistons 7:00PM'

I can see that all of the information is in the page, but I don't know how to iterate through the rows or how to pull out data that isn't directly available as a string. I understand this is broad and a lot to ask for. Sorry in advance!

Upvotes: 1

Views: 1197

Answers (1)

alecxe

Reputation: 474003

I think the SO community is sometimes too harsh on beginners.

Here is a way for you to locate the table results and extract home and away team names:

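for row in soup.select("table.schedule tbody tr"):
    # Each row is expected to contain exactly two .team-name elements;
    # in an "X at Y" matchup the visiting team is listed first.
    away_team, home_team = row.select(".team-name")

    print(away_team.get_text(), "at", home_team.get_text())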

The idea here is to use a CSS selector to locate the table rows, iterate over every row, and get the two elements with the team-name class.
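To get closer to the output format asked for in the question, the same idea can be extended to pick up the start time as well. The following is only a sketch that runs against a small inline HTML sample instead of the live page: the sample markup, the `.time` class, and the `extract_matchups` helper are assumptions made for illustration, and the real ESPN markup may well differ. It also assumes the first team listed in a row is the visiting team, matching the "X at Y" format in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that only approximates the structure assumed in the
# answer (table.schedule, .team-name). The .time class is an assumption.
SAMPLE_HTML = """
<table class="schedule">
  <thead><tr><th>matchup</th><th>time</th></tr></thead>
  <tbody>
    <tr>
      <td><a class="team-name"><span>Los Angeles Lakers</span></a></td>
      <td><a class="team-name"><span>Charlotte Bobcats</span></a></td>
      <td class="time">7:00PM</td>
    </tr>
    <tr>
      <td><a class="team-name"><span>Boston Celtics</span></a></td>
      <td><a class="team-name"><span>Detroit Pistons</span></a></td>
      <td class="time">7:00PM</td>
    </tr>
    <tr><td colspan="3">No team cells in this row</td></tr>
  </tbody>
</table>
"""


def extract_matchups(html):
    """Return one 'Team at Team time' string per schedule row."""
    soup = BeautifulSoup(html, "html.parser")
    matchups = []
    for row in soup.select("table.schedule tbody tr"):
        teams = row.select(".team-name")
        # Skip rows that do not hold exactly two teams instead of
        # letting tuple unpacking raise a ValueError.
        if len(teams) != 2:
            continue
        first, second = (team.get_text(strip=True) for team in teams)
        time_cell = row.select_one(".time")  # assumed class name
        start = time_cell.get_text(strip=True) if time_cell else ""
        matchups.append(f"{first} at {second} {start}".strip())
    return matchups


for line in extract_matchups(SAMPLE_HTML):
    print(line)
```

Against the real page you would pass `response.content` from the question's code instead of `SAMPLE_HTML`, and adjust the selectors to whatever the developer tools actually show.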


Overall the process of coming up with something like this is relatively straightforward:

  • inspect the desired element in browser developer tools
  • think about the things you can use to find this element - something that uniquely identifies this element (e.g. look at that super explicit team-name class)
  • write (in this case) Python/BeautifulSoup code to try and locate this element
  • iterate until it works
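One way to make that loop quicker is to try several candidate selectors at once and see how many elements each one matches. This is just a sketch: `count_matches` is a hypothetical helper, and the inline markup is made up to stand in for whatever you see in the developer tools.

```python
from bs4 import BeautifulSoup

# Made-up markup for experimenting; replace it with the real page content.
SAMPLE_HTML = """
<table class="schedule"><tbody>
  <tr><td><a class="team-name">Lakers</a></td>
      <td><a class="team-name">Bobcats</a></td><td>7:00PM</td></tr>
  <tr><td><a class="team-name">Celtics</a></td>
      <td><a class="team-name">Pistons</a></td><td>7:00PM</td></tr>
</tbody></table>
"""


def count_matches(html, selectors):
    """Map each CSS selector to the number of elements it matches."""
    soup = BeautifulSoup(html, "html.parser")
    return {selector: len(soup.select(selector)) for selector in selectors}


candidates = ["td", ".team-name", "table.schedule tbody tr", ".matchup"]
for selector, count in count_matches(SAMPLE_HTML, candidates).items():
    print(f"{selector!r}: {count} match(es)")
```

A selector that matches nothing is usually a sign that the class name is wrong or that the content is not present in the HTML returned by `requests`; one that matches far too much needs to be made more specific.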

That's just a high-level overview, but I hope it helps.

Upvotes: 2
