Reputation: 11
I have the following HTML:
1.
<a href="/title/tt0111161/?ref_=adv_li_tt">The Shawshank Redemption</a>
<span class="lister-item-year text-muted unbold">(1994)</span>
How do I extract the text "The Shawshank Redemption" from the 'a' tag using Beautiful Soup?
Upvotes: 1
Views: 126
Reputation:
A simple search would have given you:
from bs4 import BeautifulSoup

data = '''
<a href="/title/tt0111161/?ref_=adv_li_tt">The Shawshank Redemption</a>
<span class="lister-item-year text-muted unbold">(1994)</span>
'''
soup = BeautifulSoup(data, 'html.parser')

# .text returns the text content of the tag
print(soup.a.text)
print(soup.find('a').text)
for a in soup.find_all('a'):
    print(a.text)

# get_text() returns the same string and also accepts options such as strip=True
print(soup.a.get_text())
print(soup.find('a').get_text())
for a in soup.find_all('a'):
    print(a.get_text())
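If the page might not contain the tag, `soup.a` and `soup.find('a')` return `None`, and calling `.text` on that raises an `AttributeError`. Below is a minimal sketch of a guarded version that also picks up the year from the `span`. The helper name `extract_title_and_year` is made up for illustration and is not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

html = '''
<a href="/title/tt0111161/?ref_=adv_li_tt">The Shawshank Redemption</a>
<span class="lister-item-year text-muted unbold">(1994)</span>
'''


def extract_title_and_year(markup):
    """Hypothetical helper: return (title, year); a missing piece becomes None."""
    soup = BeautifulSoup(markup, "html.parser")
    link = soup.find("a")
    # class_ matches a tag that has this class among its classes
    year = soup.find("span", class_="lister-item-year")
    title = link.get_text(strip=True) if link else None
    year_text = year.get_text(strip=True) if year else None
    return title, year_text


print(extract_title_and_year(html))
# ('The Shawshank Redemption', '(1994)')
```

`get_text(strip=True)` trims surrounding whitespace, which helps when the markup has newlines inside the tag.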
Upvotes: 1
Reputation: 10826
Something like this would work:
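from bs4 import BeautifulSoup

st = r"""<a href="/title/tt0111161/?ref_=adv_li_tt">The Shawshank Redemption</a>
<span class="lister-item-year text-muted unbold">(1994)</span>"""
# 'html5lib' must be installed separately (pip install html5lib);
# the built-in 'html.parser' also works for this snippet
soup = BeautifulSoup(st, 'html5lib')
a = soup.find_all('a')  # list of every <a> tag in the document
print(a[0].text)  # text of the first link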
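Note that indexing with `a[0]` raises an `IndexError` when no link is found. As a rough sketch, assuming you want every link on the page rather than only the first, you could collect the text and `href` of each one into a list. The function name `link_texts` is hypothetical.

```python
from bs4 import BeautifulSoup


def link_texts(markup):
    """Hypothetical helper: return (text, href) for every <a> tag, or [] if none."""
    soup = BeautifulSoup(markup, "html.parser")
    # .get("href") returns None instead of raising when the attribute is absent
    return [(a.get_text(strip=True), a.get("href")) for a in soup.find_all("a")]


sample = '<a href="/title/tt0111161/?ref_=adv_li_tt">The Shawshank Redemption</a>'
print(link_texts(sample))
# [('The Shawshank Redemption', '/title/tt0111161/?ref_=adv_li_tt')]
```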
Upvotes: 0