drew wood

Reputation: 335

Scraping a dynamic webpage with a date selector

I want to use the requests module in Python to scrape:

https://www.lines.com/betting/nba/odds

This site contains historical betting odds data.

The main issue is that there is a date selector on the page, and I cannot find where the date value is stored. I have looked in the headers and the cookies but still cannot find it, so I cannot change it programmatically to scrape data for different dates.

In the browser's Network tab, the page appears to pull this data from:

https://www.lines.com/betting/nba/odds/best-line?date=2023-01-23

However, even when I send headers with the request, I cannot get the data for that date. The request just returns the data from:

https://www.lines.com/betting/nba/odds

which is the data for the current date.

I would like to do this with requests rather than switching to another tool such as Selenium, where the approach seems pretty straightforward (open page -> download data -> click previous date -> repeat).

Here is my current code:

import requests

# Plain GET of the main odds page; this only returns the current date's data
url = 'https://www.lines.com/betting/nba/odds/'
print(requests.get(url).text)

Thanks!

Upvotes: 1

Views: 113

Answers (1)

Andrej Kesely

Reputation: 195448

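Try passing headers={"X-Requested-With": "XMLHttpRequest"} with the request. JavaScript libraries conventionally set this header on AJAX calls, and the site appears to return the date-specific data only when it is present: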

import requests
import pandas as pd
from itertools import cycle
from bs4 import BeautifulSoup

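# The date is passed as a query parameter on the best-line endpoint
url = "https://www.lines.com/betting/nba/odds/best-line?date=2023-01-28"

# Mark the request as an AJAX call so the server returns the data for that date
soup = BeautifulSoup(
    requests.get(url, headers={"X-Requested-With": "XMLHttpRequest"}).content,
    "html.parser",
)

odds = []

# Each .odds-list-col element holds the teams and odds values for one game
for o in soup.select(".odds-list-col"):
    matches = [t["title"] for t in o.select(".odds-list-team")]
    # Odds values alternate between the teams, so cycle through the team names
    teams = cycle(matches)

    for od in o.select(".odds-list-val"):
        odds.append(
            [
                next(teams),
                " vs ".join(matches),
                # The nearest preceding column title names the odds type
                od.find_previous(class_="odds-col-title").text.strip(),
                od.get_text(strip=True, separator=" "),
            ]
        )

# One row per team and match, one column per odds type
df = pd.DataFrame(odds, columns=["Team", "Match", "Odd", "Value"]).pivot(
    index=["Team", "Match"], columns="Odd", values="Value"
)
print(df)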

Prints:

Odd                                      M/L            O/U          P/S
Team         Match                                                      
Bucks        Bucks vs Pacers            -325  o237.0 (-110)  -7.5 (-115)
Cavaliers    Cavaliers vs Thunder       -115              —  -1.0 (-110)
Grizzlies    Grizzlies vs Timberwolves  -150  o237.0 (-110)  -3.0 (-110)
Heat         Magic vs Heat              -275  u218.0 (-110)  -7.0 (-110)
Magic        Magic vs Heat              +265  o218.0 (-110)  +7.5 (-110)
Pacers       Bucks vs Pacers            +280  u237.5 (-110)  +8.0 (-108)
Raptors      Raptors vs Warriors        +188              —  +5.5 (-110)
Thunder      Cavaliers vs Thunder       -105              —  +1.0 (-110)
Timberwolves Grizzlies vs Timberwolves  +145  u237.5 (-110)  +3.5 (-110)
Warriors     Raptors vs Warriors        -205              —  -5.0 (-114)
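
To scrape several dates, as the question asks, the same request can be repeated with a different date value each time. The following is only a minimal sketch, assuming the endpoint accepts any YYYY-MM-DD date in the same way it accepted the one above. The helper names (date_range, odds_url, fetch_html) are hypothetical and introduced here for illustration; they are not part of the site or of requests.

```python
from datetime import date, timedelta

# Endpoint and header taken from the answer above
BASE_URL = "https://www.lines.com/betting/nba/odds/best-line"
HEADERS = {"X-Requested-With": "XMLHttpRequest"}


def date_range(start, end):
    """Yield every date from start to end, inclusive (hypothetical helper)."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


def odds_url(day):
    """Build the URL for one day, formatting the date as YYYY-MM-DD."""
    return f"{BASE_URL}?date={day.isoformat()}"


def fetch_html(day, session=None):
    """Download the odds HTML for one day.

    Assumption: the server behaves for every date as it did for the
    single date shown in the answer above.
    """
    import requests  # imported here so the URL helpers work without it

    client = session or requests
    response = client.get(odds_url(day), headers=HEADERS, timeout=30)
    response.raise_for_status()  # fail loudly instead of parsing an error page
    return response.text
```

Each string returned by fetch_html(day) for day in date_range(date(2023, 1, 23), date(2023, 1, 28)) could then be passed to BeautifulSoup and parsed with the same selectors as above. Pausing briefly between requests (for example with time.sleep) is a reasonable courtesy to the site.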

Upvotes: 1
