Get values using BS4

Question

I'm trying to get the "data-val" from my soup, but the values all come back in one huge list instead of being split into separate lists/columns as shown on the website.

I know the headers are here:


    
     proj. pts. / pts.
     relegated / rel.
     qualify for UCL / make UCL
     win Premier League / win league

This is what I'm trying:

import requests
from bs4 import BeautifulSoup

url = 'https://projects.fivethirtyeight.com/soccer-predictions/premier-league/'
r = requests.get(url)
soup = BeautifulSoup(r.text, "html.parser")
table = soup.find("table", {"class": "forecast-table"})
# print(table.prettify())
for i in table.find_all("td", {"class": "pct"}):
    print(i)

So ideally I'd like 4 lists, with the class names and then the matching values.

Danielle M. · Accepted Answer

I'm not entirely sure which columns you want, but this gets every cell that has a data-val in the tag's attributes:

import requests
from bs4 import BeautifulSoup

url = 'https://projects.fivethirtyeight.com/soccer-predictions/premier-league/'
r = requests.get(url)

soup = BeautifulSoup(r.text, "html.parser")
table = soup.find("table", {"class": "forecast-table"})

team_rows = table.find_all("tr", {"class": "team-row"})

for team in team_rows:
    print("Team name: {}".format(team['data-str']))

    team_data = team.find_all("td")

    for data in team_data:
        # Only some cells carry a data-val attribute; skip the rest
        if 'data-val' in data.attrs:
            print("\t{}".format(data.attrs['data-val']))
    print("\n")

If I understand your question correctly, you're looking for the last few values in each row, which are barely tagged in the HTML source. When that's the case, you can try simply picking a cell by position, something like tag[6], although that is of course not very robust. Then again, this is HTML parsing, so "not very robust" is par for the course, imho.
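
If you do go positional, counting from the end of the row may hold up a little better than a fixed index like 6, because columns added at the front won't shift it. Here is a minimal sketch on plain lists. It assumes the four values you want are the last four cells of each row, which I haven't verified against the page; the helper name last_cells and the sample numbers are made up for illustration.

```python
def last_cells(values, n=4):
    """Return the final n entries of a row, or None if the row is too short."""
    if len(values) < n:
        return None
    return values[-n:]


# Hypothetical data-val strings for one team row (not real page data)
row = ["85.2", "31.0", "0.1", "97.4", "62.3"]
picked = last_cells(row)
print(picked)
```

Returning None for short rows makes a layout change show up as a missing value rather than as an IndexError or a silently wrong column.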

What I'm doing here is finding all the team rows (which is easy thanks to the class name), and then simply looping through all the td tags inside each team row's tr.
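
If you'd rather end up with one list per column than a printout per team, the same loop can append each data-val to a list keyed by the cell's class names. This is only a sketch: SAMPLE_HTML is a simplified, made-up stand-in for the real page (the class names "num", "relegated" and "champ" and all the values are assumptions), and columns_by_class is a hypothetical helper name.

```python
from collections import defaultdict

from bs4 import BeautifulSoup

# Hypothetical markup: a simplified stand-in for the real table
SAMPLE_HTML = """
<table class="forecast-table">
  <tr class="team-row" data-str="Team A">
    <td class="team">Team A</td>
    <td class="num" data-val="80.5">81</td>
    <td class="pct relegated" data-val="0.01">1%</td>
    <td class="pct champ" data-val="0.40">40%</td>
  </tr>
  <tr class="team-row" data-str="Team B">
    <td class="team">Team B</td>
    <td class="num" data-val="42.0">42</td>
    <td class="pct relegated" data-val="0.35">35%</td>
    <td class="pct champ" data-val="0.00">0%</td>
  </tr>
</table>
"""


def columns_by_class(html):
    """Group data-val values into one list per distinct td class string."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "forecast-table"})
    if table is None:
        return {}

    columns = defaultdict(list)
    for row in table.find_all("tr", {"class": "team-row"}):
        # attrs={"data-val": True} matches only cells that have the attribute
        for cell in row.find_all("td", attrs={"data-val": True}):
            # class is multi-valued in bs4, so join it into a single key
            key = " ".join(cell.get("class", []))
            columns[key].append(cell["data-val"])
    return dict(columns)


columns = columns_by_class(SAMPLE_HTML)
for name, values in columns.items():
    print(name, values)
```

The values stay as strings, so convert them with float() if you need numbers. The lists only line up team by team if every row has the same set of cells; if the real page has several columns that share an identical class, you'd need to key on position or another attribute instead.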
