Reputation: 1073
How can I extract the text within this HTML, starting from <div class="ember-view" id="ember760">?
Please help. I tried the code below, but the text is not captured.
Code I tried:
# soup is a BeautifulSoup object
exp = soup.find('header', {'class': 'pv-profile-section__card-header'})
exp
HTML (parsed with lxml)
<div class="pv-recommendation-entity__highlights">
<blockquote class="pv-recommendation-entity__text relative">
<div class="ember-view" id="ember760"> <span class="lt-line-clamp__line">I know Abc from Data Analysis training sessions with abc,</span>
<span class="lt-line-clamp__line">Abc
is an enthusiastic candidature in training sessions. He is an</span>
<span class="lt-line-clamp__line">extremely capable and dedicated entry-level Data Science Analyst.</span>
<span class="lt-line-clamp__line">He is enhancing Analytics skills by his enthusiasm for learning new</span>
<span class="lt-line-clamp__line lt-line-clamp__line--last">
things, and has learnt new tools like R, SPSS, and Pytho<span class="lt-line-clamp__ellipsis">...
<a aria-expanded="false" class="lt-line-clamp__more" data-test-line-clamp-show-more-button="true" href="#" id="line-clamp-show-more-button" role="button">See more</a>
</span></span>
<!-- --><span class="lt-line-clamp__ellipsis lt-line-clamp__ellipsis--dummy">... <a class="lt-line-clamp__more" href="#" role="button">See more</a></span></div>
</blockquote>
</div>
</li>
</ul>
<!-- --></div>
</div></div>
Expected output
I know Abc from Data Analysis training sessions with abc,
is an enthusiastic candidature in training sessions. He is an
extremely capable and dedicated entry-level Data Science Analyst.
He is enhancing Analytics skills by his enthusiasm for learning new
things, and has learnt new tools like R, SPSS, and Pytho
Upvotes: 2
Views: 419
Reputation: 4462
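from bs4 import BeautifulSoup
# html is the markup string from the question
soup = BeautifulSoup(html, 'lxml')
# direct <span> children of the div, one per visible line
lines = soup.select('div.ember-view > span.lt-line-clamp__line')
# take only each span's first direct text node, so nested markup is skipped
text = ''.join(line.find(string=True, recursive=False) for line in lines)
print(text)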
Gives the text:
I know Abc from Data Analysis training sessions with abc,Abc
is an enthusiastic candidature in training sessions. He is anextremely capable and dedicated entry-level Data Science Analyst.He is enhancing Analytics skills by his enthusiasm for learning new
things, and has learnt new tools like R, SPSS, and Pytho
The "..." and "See more" parts are ignored, because only the first direct text node of each span is taken.
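If the lines should stay separated, as in the expected output of the question, a small variation is to strip each piece and join with a newline instead of an empty string. This is only a sketch: the shortened markup and the names pieces and first_text are made up for illustration, and html.parser is used here just to avoid the extra lxml dependency ('lxml' should behave the same for this fragment).

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the markup in the question (illustrative only)
html = '''<div class="ember-view" id="ember760"> <span class="lt-line-clamp__line">first line,</span>
<span class="lt-line-clamp__line">second
line</span>
<span class="lt-line-clamp__line lt-line-clamp__line--last">
last line<span class="lt-line-clamp__ellipsis">...
<a class="lt-line-clamp__more" href="#" role="button">See more</a>
</span></span></div>'''

soup = BeautifulSoup(html, 'html.parser')
lines = soup.select('div.ember-view > span.lt-line-clamp__line')

pieces = []
for line in lines:
    # First text node that is a direct child of the span; nested tags are not searched
    first_text = line.find(string=True, recursive=False)
    if first_text is not None:
        pieces.append(first_text.strip())

text = '\n'.join(pieces)
print(text)
```

A line break that is already inside a span (as in "second\nline" above) is kept as it is; only the whitespace around each piece is removed.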
Upvotes: 1
Reputation: 195408
You can use the CSS selector div#ember760 to select <div class="ember-view" id="ember760"> and then call the .get_text() method:
from bs4 import BeautifulSoup
txt = '''
<div class="pv-recommendation-entity__highlights">
<blockquote class="pv-recommendation-entity__text relative">
<div class="ember-view" id="ember760"> <span class="lt-line-clamp__line">I know Abc from Data Analysis training sessions with abc,</span>
<span class="lt-line-clamp__line">Abc
is an enthusiastic candidature in training sessions. He is an</span>
<span class="lt-line-clamp__line">extremely capable and dedicated entry-level Data Science Analyst.</span>
<span class="lt-line-clamp__line">He is enhancing Analytics skills by his enthusiasm for learning new</span>
<span class="lt-line-clamp__line lt-line-clamp__line--last">
things, and has learnt new tools like R, SPSS, and Pytho<span class="lt-line-clamp__ellipsis">...
<a aria-expanded="false" class="lt-line-clamp__more" data-test-line-clamp-show-more-button="true" href="#" id="line-clamp-show-more-button" role="button">See more</a>
</span></span>
<!-- --><span class="lt-line-clamp__ellipsis lt-line-clamp__ellipsis--dummy">... <a class="lt-line-clamp__more" href="#" role="button">See more</a></span></div>
</blockquote>
</div>
</li>
</ul>
<!-- --></div>
</div></div>'''
soup = BeautifulSoup(txt, 'lxml')
print(soup.select_one('div#ember760').get_text(strip=True, separator='\n'))
Prints:
I know Abc from Data Analysis training sessions with abc,
Abc
is an enthusiastic candidature in training sessions. He is an
extremely capable and dedicated entry-level Data Science Analyst.
He is enhancing Analytics skills by his enthusiasm for learning new
things, and has learnt new tools like R, SPSS, and Pytho
...
See more
...
See more
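The output above still contains the "..." and "See more" entries, which the question does not want. One possible way around that, sketched here under the assumption that every unwanted piece sits inside a span with the class lt-line-clamp__ellipsis (true for the markup shown, not verified for other pages), is to remove those spans with decompose() before calling get_text(). The shortened markup and the variable names are illustrative, and html.parser is used only to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the markup in the question (illustrative only)
html = '''<div class="ember-view" id="ember760"> <span class="lt-line-clamp__line">first line,</span>
<span class="lt-line-clamp__line lt-line-clamp__line--last">
last line<span class="lt-line-clamp__ellipsis">...
<a class="lt-line-clamp__more" href="#" role="button">See more</a>
</span></span>
<!-- --><span class="lt-line-clamp__ellipsis lt-line-clamp__ellipsis--dummy">... <a class="lt-line-clamp__more" href="#" role="button">See more</a></span></div>'''

soup = BeautifulSoup(html, 'html.parser')
div = soup.select_one('div#ember760')

# Drop the ellipsis spans (and the "See more" links inside them) from the tree
for ellipsis in div.select('span.lt-line-clamp__ellipsis'):
    ellipsis.decompose()

result = div.get_text(strip=True, separator='\n')
print(result)
```

Note that decompose() modifies the parsed tree in place, so this fits best when the removed elements are not needed later.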
Upvotes: 2