Reputation: 119
The website's HTML looks like this:
<ul class="article-list">
  <li>
    <p class="promo">
      "sentence sentence sentence sentence"
      <a class="readmore" href="https://link.blahblah.com"> Read more >> </a>
    </p>
  </li>
</ul>
I tried:
ul = soup.find_all("ul", class_="article-list")
for elem in ul:
    lis = elem.find_all("li")
    for x in lis:
        preview = x.find("p", class_="promo").get_text()
This returns:
"sentence sentence sentence sentence Read more"
How can I return "sentence sentence sentence sentence" only without "Read more"?
Upvotes: 0
Views: 74
Reputation: 11
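You could try collecting the results in a list:
from bs4 import BeautifulSoup as bs

soup = bs(resp, 'html.parser')  # resp is assumed to hold the page's HTML
ul = soup.find_all("ul", class_="article-list")
previews = []
for elem in ul:
    lis = elem.find_all("li")
    for x in lis:
        promo = x.find("p", class_="promo")
        # use a separate name so the list is not overwritten by the tag
        previews.append(promo.get_text())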
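Collecting the text of the whole <p> into a list still picks up the link text, so the "Read more" part has to be dropped separately. Here is a minimal sketch of one way to do that, assuming it is acceptable to modify the parsed tree: remove each a.readmore element with .decompose() before reading the text. The html string is just the sample from the question, and the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; a real page would have several <li> items.
html = '''<ul class="article-list">
<li>
<p class="promo">
"sentence sentence sentence sentence"
<a class="readmore" href="https://link.blahblah.com"> Read more >> </a>
</p>
</li>
</ul>'''

soup = BeautifulSoup(html, "html.parser")

previews = []
for p in soup.select("ul.article-list li p.promo"):
    # Remove the "Read more" link from the tree (this mutates the soup).
    for link in p.find_all("a", class_="readmore"):
        link.decompose()
    # strip=True trims each remaining text node and skips empty ones.
    previews.append(p.get_text(strip=True))

print(previews)
```

Since .decompose() destroys the link, grab its href first if you also need the URL.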
Upvotes: 0
Reputation: 195448
You can use the .find_next() method with the text=True parameter (newer Beautiful Soup releases name this argument string=True):
data = '''<ul class="article-list">
<li>
<p class="promo">
"sentence sentence sentence sentence"
<a class="readmore" href="https://link.blahblah.com"> Read more >> </a>
</p>
</li>
</ul>'''
from bs4 import BeautifulSoup
soup = BeautifulSoup(data, 'lxml')
print(soup.select_one('p.promo').find_next(text=True))
Prints:
"sentence sentence sentence sentence"
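If the page has several list items, the same idea can probably be applied per item. Below is a rough sketch, assuming every preview sits directly inside its p.promo tag; the two-item html string is made up for illustration and is not from the real site. It uses find_all(string=True, recursive=False) so that only text nodes that are direct children of the <p> are kept, which leaves out the text inside the <a>.

```python
from bs4 import BeautifulSoup

# Hypothetical two-item sample, modelled on the markup in the question.
html = '''<ul class="article-list">
<li><p class="promo">"first preview" <a class="readmore" href="#"> Read more >> </a></p></li>
<li><p class="promo">"second preview" <a class="readmore" href="#"> Read more >> </a></p></li>
</ul>'''

soup = BeautifulSoup(html, "html.parser")

previews = []
for p in soup.select("ul.article-list p.promo"):
    # recursive=False limits the search to the <p>'s direct children,
    # so the link text nested inside <a> is not returned.
    direct_text = p.find_all(string=True, recursive=False)
    # Drop whitespace-only nodes and join whatever remains.
    previews.append(" ".join(s.strip() for s in direct_text if s.strip()))

print(previews)
```

Unlike removing the link, this leaves the parsed tree untouched, so the href is still available afterwards.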
Upvotes: 1