My example code:
from bs4 import BeautifulSoup
import re
html = """
<li> <span>EAN:</span> 1111111</li>
<li> <span>Price:</span> 3</li>
"""
soup = BeautifulSoup(html, 'html.parser')
for tag in soup.find_all("li"):
    print("{0}: {1}".format(tag.name, tag.text))
Output
li: EAN: 1111111
li: Price: 3
Expected Output
EAN: 1111111
How can I extract only EAN: 1111111? Passing string=("EAN:") is not working.
Upvotes: 1
Views: 265
Reputation: 25048
If you just want to print every li that contains EAN, use soup.find_all('span', string=re.compile(r'EAN')) to match the span elements, then print the text of each match's parent:
Example
from bs4 import BeautifulSoup
import re
html = """
<li> <span>EAN:</span> 1111111</li>
<li> <span>Price:</span> 1</li>
<li> <span>EAN:</span> 2222222</li>
<li> <span>Price:</span> 2</li>
<li> <span>EAN:</span> 3333333</li>
<li> <span>Price:</span> 3</li>
"""
soup = BeautifulSoup(html, 'html.parser')
for tag in soup.find_all('span', string=re.compile(r'EAN')):
    print(tag.parent.text)
Output
EAN: 1111111
EAN: 2222222
EAN: 3333333
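As for why string=("EAN:") did not work in the question: the string argument is compared against a tag's .string, and an li that holds both a span and trailing text has .string equal to None, so it never matches. The span, whose only child is the text "EAN:", does match. Below is a minimal sketch of that difference, which also pulls out the label and the value separately. It assumes the value is always the text node directly after the span; the variable names (li_matches, label, value, result) are only illustrative.

```python
from bs4 import BeautifulSoup

html = """
<li> <span>EAN:</span> 1111111</li>
<li> <span>Price:</span> 3</li>
"""

soup = BeautifulSoup(html, "html.parser")

# The <li> has several children (whitespace, <span>, text), so its
# .string is None and an exact string match on <li> finds nothing.
li_matches = soup.find_all("li", string="EAN:")
print(len(li_matches))

# The <span> contains only the text "EAN:", so the same filter matches it.
span = soup.find("span", string="EAN:")

label = span.get_text(strip=True)
# Assumption: the value is the text node immediately following the span.
value = span.next_sibling.strip()

result = f"{label} {value}"
print(result)
```

If the markup might have other nodes between the span and the value, taking span.parent.get_text() and stripping it is probably the more forgiving choice.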
Upvotes: 1