Reputation: 15008
As the title says: is there any way to search the entire DOM for a specific text, for instance the word CAPTCHA?
Upvotes: 1
Views: 1857
Reputation: 474171
You can use find() and specify the text argument:
With text you can search for strings instead of tags. As with name and the keyword arguments, you can pass in a string, a regular expression, a list, a function, or the value True.
>>> from bs4 import BeautifulSoup
>>> data = """
... <div>test1</div>
... <div class="myclass1">test2</div>
... <div class="myclass2">CAPTCHA</div>
... <div class="myclass3">test3</div>"""
>>> soup = BeautifulSoup(data, 'html.parser')
>>> soup.find(text='CAPTCHA').parent
<div class="myclass2">CAPTCHA</div>
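The quoted documentation also mentions lists and the value True. A minimal sketch of those two forms follows, assuming Beautiful Soup 4.4.0 or later, where the same argument is also available under the name string (text is the older spelling); the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

data = """
<div>test1</div>
<div class="myclass1">test2</div>
<div class="myclass2">CAPTCHA</div>
"""
soup = BeautifulSoup(data, "html.parser")

# A list matches any text node that is exactly equal to one of its items.
either = soup.find_all(string=["test1", "CAPTCHA"])

# True matches every text node, including whitespace-only ones,
# so blank strings are filtered out afterwards.
all_strings = soup.find_all(string=True)
non_blank = [str(s) for s in all_strings if s.strip()]
```

Note that a plain string or a list only matches a text node whose whole content equals the given value, so neither form finds CAPTCHA inside a longer sentence.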
If CAPTCHA is only part of the text, you can pass a lambda function as text and check whether CAPTCHA occurs inside the tag text:
>>> data = """
... <div>test1</div>
... <div class="myclass1">test2</div>
... <div class="myclass2">Here CAPTCHA is a part of a sentence</div>
... <div class="myclass3">test3</div>"""
>>> soup = BeautifulSoup(data, 'html.parser')
>>> soup.find(text=lambda x: 'CAPTCHA' in x).parent
<div class="myclass2">Here CAPTCHA is a part of a sentence</div>
Or, the same can be achieved by passing a regular expression as text:
>>> import re
>>> soup.find(text=re.compile('CAPTCHA')).parent
<div class="myclass2">Here CAPTCHA is a part of a sentence</div>
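find() returns only the first match. To cover the whole document, the same filter can be handed to find_all(). The sketch below assumes Beautiful Soup 4.4.0 or later (it uses the string spelling of the argument), and find_parents_containing is a hypothetical helper name, not part of the library.

```python
import re
from bs4 import BeautifulSoup

def find_parents_containing(html, word):
    """Return the enclosing tag of every text node that contains `word`.

    Hypothetical helper for illustration only.
    """
    soup = BeautifulSoup(html, "html.parser")
    # re.escape makes the word match literally, even if it holds regex characters.
    pattern = re.compile(re.escape(word))
    # find_all(string=...) yields matching text nodes; .parent is the tag around each.
    return [node.parent for node in soup.find_all(string=pattern)]

html = """
<div>test1</div>
<div class="myclass1">Please solve the CAPTCHA</div>
<div class="myclass2">CAPTCHA</div>
"""
matches = find_parents_containing(html, "CAPTCHA")
print([tag.get("class") for tag in matches])
```

Each text node is tested on its own, so a word split across nested tags (for example half of it inside a b element) would not be found this way.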
Upvotes: 5