Reputation: 83
I want to extract the text inside div tags using BeautifulSoup:
<div class="post contentTemplate" itemprop="text">Data to extract<div class="clear"></div></div>
Upvotes: 1
Views: 59
Reputation: 9257
You can try something like this:
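from bs4 import BeautifulSoup as bs
data = '<div class="post contentTemplate" itemprop="text">Data to extract<div class="clear"></div></div>'
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = bs(data, "html.parser")
# find_all is the current name for findAll
m = soup.find_all("div", {"class": "post contentTemplate"})
for k in m:
    print(k.get_text())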
Output:
Data to extract
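One thing to be aware of: passing the string "post contentTemplate" as the class filter matches the class attribute's exact string value, so it would miss a div whose classes appear in a different order. A CSS selector avoids that. The following is only a sketch of that alternative, using the question's sample markup; the variable names (html_doc, posts, texts) are made up for illustration.

```python
from bs4 import BeautifulSoup

# Sample markup taken from the question
html_doc = '<div class="post contentTemplate" itemprop="text">Data to extract<div class="clear"></div></div>'
soup = BeautifulSoup(html_doc, "html.parser")

# CSS selector: matches divs that carry both classes, in any order
posts = soup.select("div.post.contentTemplate")

# strip=True trims whitespace and drops empty strings from nested tags
texts = [div.get_text(strip=True) for div in posts]
print(texts)  # ['Data to extract']
```

With strip=True the empty inner div contributes nothing to the result, so no extra filtering is needed here.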
Upvotes: 1
Reputation: 1001
You can use the get_text() method. This will extract the text from every div that find_all() finds in the source code (html is assumed here to be the parsed BeautifulSoup object).
data = [e.get_text() for e in html.find_all('div')]
When run, it returns:
[u'Data to extract', u'']
If you don't want the empty values, just filter them out:
data = [e.get_text() for e in html.find_all('div') if e.get_text()]
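For completeness, here is a self-contained sketch of the same idea that can be run as is. It assumes html is built from the question's sample markup with the built-in "html.parser"; the names source, all_text and non_empty are only illustrative. It also filters on t.strip(), so that divs containing only whitespace are dropped as well, which is a small variation on the filter above.

```python
from bs4 import BeautifulSoup

# Sample markup taken from the question
source = '<div class="post contentTemplate" itemprop="text">Data to extract<div class="clear"></div></div>'
html = BeautifulSoup(source, "html.parser")

# One entry per div, in document order: the outer div, then the empty inner div
all_text = [e.get_text() for e in html.find_all("div")]

# Keep only entries that contain something other than whitespace
non_empty = [t for t in all_text if t.strip()]

print(all_text)   # ['Data to extract', '']
print(non_empty)  # ['Data to extract']
```

On Python 3 the strings print without the u prefix shown in the output above.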
Upvotes: 0