Reputation:
I need to get the value "Anti-Mage" from an element with this class in Python. How can I do it?
<td class="cell-xlarge"><a href="/players/432283612/matches?hero=anti-mage">Anti-Mage</a><div class="subtext minor"><a href="/matches/6107031786"><time data-time-ago="2021-07-26T23:27:54+00:00" datetime="2021-07-26T23:27:54+00:00" title="Mon, 26 Jul 2021 23:27:54 +0000">2021-07-26</time></a></div></td>
Upvotes: 0
Views: 201
Reputation: 25196
Based on your comment, to get all the <a> elements:
import requests
from bs4 import BeautifulSoup as BS
url = 'https://www.dotabuff.com/players/432283612'
headers = {
"Accept":"*/*",
"User-Agent":"Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0"
}
r = requests.get(url, headers=headers)
soup = BS(r.text, 'html.parser')  # BeautifulSoup was imported under the alias BS
[x.text for x in soup.select('article a[href*="matches?hero"]')]
['Anti-Mage', 'Shadow Fiend', 'Slark', 'Morphling', 'Tinker', 'Bristleback', 'Invoker', 'Broodmother', 'Templar Assassin', 'Monkey King']
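The selector above can also be tried without a network request. The following is a minimal sketch that runs the same a[href*="matches?hero"] selector against a small hand-written HTML sample. The sample markup (the article wrapper, the second row, and its placeholder match link) is an assumption modeled on the snippet in the question, not the real page, and the helper name hero_names is hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical sample modeled on the question's markup; not fetched from the site.
SAMPLE_HTML = """
<article><table><tr>
<td class="cell-xlarge"><a href="/players/432283612/matches?hero=anti-mage">Anti-Mage</a>
<div class="subtext minor"><a href="/matches/6107031786">2021-07-26</a></div></td>
</tr><tr>
<td class="cell-xlarge"><a href="/players/432283612/matches?hero=slark">Slark</a>
<div class="subtext minor"><a href="/matches/0000000000">2021-07-25</a></div></td>
</tr></table></article>
"""


def hero_names(html):
    """Return the text of every <a> inside <article> whose href contains 'matches?hero'."""
    soup = BeautifulSoup(html, "html.parser")
    links = soup.select('article a[href*="matches?hero"]')
    return [a.get_text(strip=True) for a in links]


print(hero_names(SAMPLE_HTML))
```

The match links under each hero name are skipped because their href does not contain the substring matches?hero.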
Assuming the HTML posted in your question has been parsed into a BeautifulSoup object, read the text attribute of the <a>:
soup.a.text
or select more specifically, using the class you mentioned:
soup.select_one('.cell-xlarge a').text
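As a quick check, both expressions can be run against the HTML from the question. This is only a sketch: the snippet is shortened to the parts that matter, it is parsed with the built-in html.parser, and the names QUESTION_HTML, first_link_text and by_class_text are introduced here for illustration.

```python
from bs4 import BeautifulSoup

# Shortened copy of the <td> from the question.
QUESTION_HTML = (
    '<td class="cell-xlarge">'
    '<a href="/players/432283612/matches?hero=anti-mage">Anti-Mage</a>'
    '<div class="subtext minor"><a href="/matches/6107031786">'
    '<time datetime="2021-07-26T23:27:54+00:00">2021-07-26</time></a></div>'
    '</td>'
)

soup = BeautifulSoup(QUESTION_HTML, "html.parser")

# First <a> anywhere in the parsed document.
first_link_text = soup.a.text
# First <a> that is a descendant of an element with class "cell-xlarge".
by_class_text = soup.select_one(".cell-xlarge a").text

print(first_link_text, by_class_text)
```

Both lookups land on the same element here because the hero link is the first <a> in the cell; on a full page, soup.a would return the first link of the whole document instead.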
Note: Selecting elements by class is in some cases only the third-best option, because classes can be dynamic and are not unique. Better strategies are to select by id or tag.
Upvotes: 1
Reputation:
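# Assumes `a` holds the player ID and `headers` the request headers defined earlier
r1 = requests.get(f"https://www.dotabuff.com/players/{a}/heroes", headers=headers)
html1 = BS(r1.content, 'lxml')
# find_all returns every matching cell; find would return only the first one
for td in html1.find_all('td', {'class': 'cell-xlarge'}):
    # look at direct <a> children only, so the nested match link is skipped
    for link in td.find_all('a', recursive=False):
        print(link.string)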
Upvotes: 1
Reputation: 160
First, you'll need to select the parent item (td in this case) by its class name. You can do something like
td = soup.find('td', {'class': 'cell-xlarge'})
and then find its child a tags with something like this
a = td.findChildren('a', recursive=False)[0]
And this will give you the a tag. To get its value, use .string like this
a_value = a.string
And that gives you the value Anti-Mage.
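Put together, the three steps might look like the sketch below. It runs on a shortened copy of the HTML from the question rather than the live page, and it uses find_all, of which findChildren is an older alias. The helper name first_link_text and the None fallbacks are assumptions added for illustration.

```python
from bs4 import BeautifulSoup

# Shortened copy of the <td> from the question.
QUESTION_HTML = (
    '<td class="cell-xlarge">'
    '<a href="/players/432283612/matches?hero=anti-mage">Anti-Mage</a>'
    '<div class="subtext minor"><a href="/matches/6107031786">'
    '<time datetime="2021-07-26T23:27:54+00:00">2021-07-26</time></a></div>'
    '</td>'
)


def first_link_text(html, css_class="cell-xlarge"):
    """Return the text of the first direct <a> child of the first matching <td>."""
    soup = BeautifulSoup(html, "html.parser")
    td = soup.find("td", {"class": css_class})
    if td is None:
        return None  # no cell with that class
    # recursive=False restricts the search to direct children of the <td>
    links = td.find_all("a", recursive=False)
    return links[0].string if links else None


print(first_link_text(QUESTION_HTML))
```

Checking for a missing td or an empty list avoids the AttributeError or IndexError that the bare three-liner would raise on a page without the expected cell.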
Upvotes: 0