user17763427

Extract value from class using BeautifulSoup

I need to get the value "Anti-Mage" from the element with this class in Python. How can I do it?

<td class="cell-xlarge"><a href="/players/432283612/matches?hero=anti-mage">Anti-Mage</a><div class="subtext minor"><a href="/matches/6107031786"><time data-time-ago="2021-07-26T23:27:54+00:00" datetime="2021-07-26T23:27:54+00:00" title="Mon, 26 Jul 2021 23:27:54 +0000">2021-07-26</time></a></div></td>

Upvotes: 0

Views: 201

Answers (3)

HedgeHog

Reputation: 25196

EDIT

Based on your comment, this gets the text of all the hero <a> links:

import requests
from bs4 import BeautifulSoup as BS

url = 'https://www.dotabuff.com/players/432283612'
headers = {
    "Accept": "*/*",
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
}
r = requests.get(url, headers=headers)
# Use the BS alias imported above and name the parser explicitly
soup = BS(r.text, 'html.parser')

# Text of every <a> inside <article> whose href contains "matches?hero"
[x.text for x in soup.select('article a[href*="matches?hero"]')]

Output

['Anti-Mage', 'Shadow Fiend', 'Slark', 'Morphling', 'Tinker', 'Bristleback', 'Invoker', 'Broodmother', 'Templar Assassin', 'Monkey King']
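To see what the attribute selector does without hitting the site, here is a minimal offline sketch. The HTML below is hypothetical sample markup loosely modeled on the cell in the question, not the real page, and the name `heroes` is just for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (assumption), not the live Dotabuff page.
html = """
<article>
  <table>
    <tr><td class="cell-xlarge"><a href="/players/1/matches?hero=anti-mage">Anti-Mage</a></td></tr>
    <tr><td class="cell-xlarge"><a href="/players/1/matches?hero=slark">Slark</a></td></tr>
    <tr><td><a href="/matches/6107031786">2021-07-26</a></td></tr>
  </table>
</article>
"""

soup = BeautifulSoup(html, "html.parser")

# [href*="..."] matches any <a> whose href contains the given substring,
# so the plain match link in the last row is skipped.
heroes = [a.text for a in soup.select('article a[href*="matches?hero"]')]
print(heroes)
```

Only the two links whose href contains "matches?hero" are collected; the date link is left out.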

Assuming the HTML posted in your question has been parsed into a BeautifulSoup object, read the text attribute of the first <a>:

soup.a.text

or select more specifically, using the class you mentioned:

soup.select_one('.cell-xlarge a').text

Note: Selecting elements by class is in some cases only the third-best option, because classes can be dynamic and are not unique. Better strategies are to select by id or tag.
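As a self-contained check of both variants, the sketch below parses the cell from the question (shortened: some `<time>` attributes are omitted) with the built-in "html.parser". The variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# The markup from the question, shortened for readability.
html = (
    '<td class="cell-xlarge">'
    '<a href="/players/432283612/matches?hero=anti-mage">Anti-Mage</a>'
    '<div class="subtext minor"><a href="/matches/6107031786">'
    '<time datetime="2021-07-26T23:27:54+00:00">2021-07-26</time></a></div>'
    '</td>'
)
soup = BeautifulSoup(html, "html.parser")

# soup.a is the first <a> anywhere in the parsed document.
first_link = soup.a.text
# select_one returns the first <a> nested inside an element with class cell-xlarge.
by_class = soup.select_one(".cell-xlarge a").text
print(first_link, by_class)
```

Both expressions land on the same link here because the hero link is the first <a> in the cell.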

Upvotes: 1

user17763427

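import requests
from bs4 import BeautifulSoup as BS

# `a` (the player id) and `headers` are assumed to be defined earlier
r1 = requests.get(f"https://www.dotabuff.com/players/{a}/heroes", headers=headers)
html1 = BS(r1.content, 'lxml')

# find_all returns every matching <td>; find would return only the first one
for td in html1.find_all('td', {'class': 'cell-xlarge'}):
    link = td.find('a', recursive=False)  # direct <a> child only
    if link is not None:
        print(link.string)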

Upvotes: 1

shasherazi

Reputation: 160

First, you'll need to select the parent item (td in this case) from its class name. You can do something like

td = soup.find('td', {'class': 'cell-xlarge'})

and then find its direct <a> children with something like this

a = td.findChildren('a', recursive=False)[0]

This gives you the <a> tag. To get its text, use .string like this

a_value = a.string

And that gives you the value Anti-Mage.
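Put together, the steps above might look like the sketch below. The helper name `first_hero_name` is hypothetical, and the `None` checks are an addition for pages where the cell or link is missing.

```python
from bs4 import BeautifulSoup

def first_hero_name(html):
    """Return the text of the first direct <a> child of td.cell-xlarge, or None."""
    soup = BeautifulSoup(html, "html.parser")
    td = soup.find("td", {"class": "cell-xlarge"})
    if td is None:
        return None
    # recursive=False restricts the search to direct children of the <td>,
    # so the <a> nested inside the <div> is not considered.
    link = td.find("a", recursive=False)
    return link.string if link is not None else None

sample = (
    '<td class="cell-xlarge">'
    '<a href="/players/432283612/matches?hero=anti-mage">Anti-Mage</a>'
    '<div class="subtext minor"><a href="/matches/6107031786">2021-07-26</a></div>'
    '</td>'
)
print(first_hero_name(sample))
print(first_hero_name("<p>no table here</p>"))
```

Using `find` with `recursive=False` returns the tag directly, which avoids indexing into the list that `findChildren` returns.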

Upvotes: 0
