hello543

Reputation: 137

Python BeautifulSoup extract Class Text only if it contains specific text

Is there a way to extract the class below only if its whole text is New?

 <li class="ClassifiedDetail">New</li>

tried:

doc.find('li', class_ = 'ClassifiedDetail').attrs['New']

Maybe something like: if the class text equals New or contains 'New', take it?

Upvotes: 1

Views: 1432

Answers (1)

HedgeHog

Reputation: 25196

Note: It is not entirely clear whether you mean the class or the tag, so I assume you mean the text of a tag.

One approach is to use CSS selectors with the :-soup-contains() pseudo-class:

soup.select('li.ClassifiedDetail:-soup-contains("New")')

An alternative is string=re.compile(), because the string argument (named text in older versions), when given a plain str, matches only the exact full string:

soup.find_all('li', class_='ClassifiedDetail', string=re.compile('New'))
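
To make the difference concrete, here is a minimal sketch comparing a plain str with a compiled pattern. It assumes bs4 is installed, and the sample markup is made up for illustration:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, for illustration only
html = '''
<li class="ClassifiedDetail">New</li>
<li class="ClassifiedDetail">New York</li>
<li class="ClassifiedDetail">Old</li>
'''

soup = BeautifulSoup(html, 'html.parser')

# A plain str matches only when the tag's string equals it exactly
exact = soup.find_all('li', class_='ClassifiedDetail', string='New')

# A compiled pattern is searched within the string, so substrings match too
partial = soup.find_all('li', class_='ClassifiedDetail', string=re.compile('New'))

exact_texts = [li.get_text() for li in exact]
partial_texts = [li.get_text() for li in partial]
print(exact_texts)
print(partial_texts)
```

Anchoring the pattern, e.g. re.compile(r'^New$'), should behave like the exact match while still allowing regex features.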

Example

from bs4 import BeautifulSoup

html='''
<li class="ClassifiedDetail">New</li>
<li class="ClassifiedDetail">New York</li>
<li class="ClassifiedDetail">Ne </li>
<li class="ClassifiedDetail">Old</li>
<li class="ClassifiedDetail">knew</li>
'''

soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly to avoid the "no parser specified" warning
for li in soup.select('li.ClassifiedDetail:-soup-contains("New")'):
    print(li.text)

Output

New
New York
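
If you only want tags whose whole text is New (so New York is excluded), one option is to filter on the stripped text yourself. This is a rough sketch, assuming that surrounding whitespace should be ignored; the markup is again made up for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, for illustration only
html = '''
<li class="ClassifiedDetail">New</li>
<li class="ClassifiedDetail"> New </li>
<li class="ClassifiedDetail">New York</li>
<li class="ClassifiedDetail">knew</li>
'''

soup = BeautifulSoup(html, 'html.parser')

# Keep only the tags whose text, with whitespace stripped, equals 'New'
matches = [
    li for li in soup.select('li.ClassifiedDetail')
    if li.get_text(strip=True) == 'New'
]

match_count = len(matches)
print(match_count)
```

Comparing the stripped text also catches tags with padding around the word, which an exact string='New' match would miss.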

Upvotes: 1
