Reputation: 159
I'm looking for a robust way to extract both team names and the market odds. Given the above code segment, this would be:
West Brom Man City 28/1 6/1 1/8
I should also mention that I would only need team names and market odds AFTER a given fixture id (which is located in the 'data-fixtureid' attribute).
I have tried the following xpath expression:
tree.xpath('//span[@class="ippg-Market_Truncator"]/following::div[@data-fixtureid="66705048"]//text()')
to extract the team names, but it didn't produce the desired output.
I'd appreciate it if someone could point me in the right direction. I don't necessarily need to use XPath for this; I could also use Beautiful Soup, for example.
Upvotes: 1
Views: 750
Reputation: 838
This answer doesn't use XPath. Instead, I used Beautiful Soup's find_all() and find() methods to get your desired result.
First, I look for all the rows you need, which have the class name podEventRow.
Second, I loop through that list and look for the team name with the class ippg-Market_CompetitorName, then strip any unnecessary whitespace.
Third, inside the same loop I look for the market odds using the class name ippg-Market_Topic, and then loop through each of the odds to get the text inside.
# soup is a BeautifulSoup object built from the rendered page source
podEventRow = soup.find_all('div', class_="podEventRow")
for row in podEventRow:
    # first competitor name in the row, with surrounding whitespace stripped
    team_name = row.find('div', class_="ippg-Market_CompetitorName").get_text(strip=True).replace('\t\r\n', '')
    market_odds_raw = row.find_all('div', class_="ippg-Market_Topic")
    market_odds = ''
    for odd in market_odds_raw:
        # append each price, separated by ' - '
        market_odds += ' - ' + odd.get_text(strip=True).replace('\t\r\n', '')
    print(team_name + market_odds)
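The question also asks for only the fixtures that come after a given data-fixtureid, and for both team names in each row. Here is a rough sketch of one way to do that with find_all_next(). The sample markup is my own simplified guess at the page structure (two ippg-Market_CompetitorName divs per row, data-fixtureid on the row itself), and the "Team A" to "Team D" rows, their odds and their fixture ids are made up for illustration, so adjust it to the real HTML.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup; the real page may be structured differently.
SAMPLE_HTML = """
<div class="podEventRow" data-fixtureid="66705048">
  <div class="ippg-Market_CompetitorName"> West Brom </div>
  <div class="ippg-Market_CompetitorName"> Man City </div>
  <div class="ippg-Market_Topic">28/1</div>
  <div class="ippg-Market_Topic">6/1</div>
  <div class="ippg-Market_Topic">1/8</div>
</div>
<div class="podEventRow" data-fixtureid="1001">
  <div class="ippg-Market_CompetitorName">Team A</div>
  <div class="ippg-Market_CompetitorName">Team B</div>
  <div class="ippg-Market_Topic">2/1</div>
  <div class="ippg-Market_Topic">3/1</div>
  <div class="ippg-Market_Topic">5/4</div>
</div>
<div class="podEventRow" data-fixtureid="1002">
  <div class="ippg-Market_CompetitorName">Team C</div>
  <div class="ippg-Market_CompetitorName">Team D</div>
  <div class="ippg-Market_Topic">1/2</div>
  <div class="ippg-Market_Topic">7/2</div>
  <div class="ippg-Market_Topic">9/1</div>
</div>
"""


def rows_after_fixture(soup, fixture_id):
    """Return the podEventRow divs that follow the element with this fixture id."""
    anchor = soup.find(attrs={"data-fixtureid": fixture_id})
    if anchor is None:
        return []
    # find_all_next() walks forward in document order, so the anchor row
    # itself is not included in the result.
    return anchor.find_all_next("div", class_="podEventRow")


def parse_row(row):
    """Collect every team name and every price found in one row."""
    names = [d.get_text(strip=True)
             for d in row.find_all("div", class_="ippg-Market_CompetitorName")]
    odds = [d.get_text(strip=True)
            for d in row.find_all("div", class_="ippg-Market_Topic")]
    return names, odds


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
results = [parse_row(row) for row in rows_after_fixture(soup, "66705048")]
for names, odds in results:
    print(" ".join(names + odds))
```

If you want the row with the given fixture id included as well, prepend its enclosing podEventRow to the list before parsing.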
PS: I used Selenium to get the complete page source, since the site uses JavaScript to load the table.
Upvotes: 1