Reputation: 135
I am trying to extract the innerHTML from a tag using the following code:
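# Imports assumed for Python 3 and the bs4 package
from urllib.request import urlopen
from bs4 import BeautifulSoup

theurl = "http://na.op.gg/summoner/userName=Darshan"
thepage = urlopen(theurl)
soup = BeautifulSoup(thepage, "html.parser")
rank = soup.findAll('span', {"class": "tierRank"})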
However, I am getting [<span class="tierRank">Master</span>] instead.
What I want to show is only the value "Master".
Using soup.get_text instead of soup.findAll doesn't work. I tried adding .text and .string to the end of the last line, but that did not work either.
Upvotes: 12
Views: 11142
Reputation: 1468
If you want to extract the innerHTML of every matching tag in bulk, you can use the following:
from bs4 import BeautifulSoup
soup = BeautifulSoup(open("C:\\test.html"), "html.parser")
# Print the innerHTML of each matching element, one per line
for data1 in soup.find_all('td', {'class' : 'YourClass'}):
    print(data1.decode_contents())
Upvotes: 2
Reputation: 11
Use .decode_contents() if you want the innerHTML (with HTML tags); use .text if you want the innerText (no HTML tags).
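To make the difference concrete, here is a minimal sketch. The HTML string is made up for illustration (it is not the real op.gg markup); it adds a nested <b> tag so that the two results differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, chosen so innerHTML and innerText are not the same
html = '<span class="tierRank"><b>Master</b> tier</span>'
soup = BeautifulSoup(html, "html.parser")
span = soup.find("span", {"class": "tierRank"})

inner_html = span.decode_contents()  # keeps the nested <b> tag
inner_text = span.text               # text only, tags stripped

print(inner_html)
print(inner_text)
```

When the tag contains plain text only, as in the question, both calls should give the same string.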
Upvotes: 1
Reputation: 5313
soup.findAll('span',{"class":"tierRank"}) returns a list of elements that match <span class="tierRank">. You need the first element of that list and then the innerHTML from that element, which can be accessed by the decode_contents() method. All together:
rank = soup.findAll('span',{"class":"tierRank"})[0].decode_contents()
This will store "Master" in rank.
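This is also most likely why appending .text or .string to the findAll call did not work: those attributes belong to a single tag, not to the list of results. Below is a rough sketch of two alternatives, using a made-up HTML snippet in place of the live page. It uses find_all, the current spelling of findAll, and find, which returns the first match or None.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page
html = '<div><span class="tierRank">Master</span></div>'
soup = BeautifulSoup(html, "html.parser")

# Option 1: keep find_all and pull the text out of every match
matches = soup.find_all("span", {"class": "tierRank"})
ranks = [m.get_text(strip=True) for m in matches]

# Option 2: find() returns one tag (or None), so guard before reading it
first = soup.find("span", {"class": "tierRank"})
rank = first.get_text(strip=True) if first is not None else None

print(ranks, rank)
```

The None check matters if the page layout changes and the span is missing; indexing with [0] would raise an IndexError in that case.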
Upvotes: 17