prog

Reputation: 1073

TypeError while extracting text from a BeautifulSoup object

I have a bs4 result called lines that holds span elements. I am trying to extract their text, but I get the TypeError shown below.

lines contains:

[<span class="lt-line-clamp__line">I'm excited to be entering a new phase of my career at Xyz!</span>,
 <span class="lt-line-clamp__line"></span>,
 <span class="lt-line-clamp__line lt-line-clamp__line--last">
       I'm a program manager, product development leader, and business strategist who is passionate about delive<span class="lt-line-clamp__ellipsis">...
             <a aria-expanded="false" class="lt-line-clamp__more" data-test-line-clamp-show-more-button="true" href="#" id="line-clamp-show-more-button" role="button">see more</a>
 </span></span>]

Code:

lines = about.select('span.lt-line-clamp__line')  # lines contains the elements shown above
about = ''.join([line.find(text=True, recursive=False) for line in lines])

Error:

TypeError                                 Traceback (most recent call last)
<ipython-input-104-45a217b0c18f> in <module>
      1 lines = about.select('span.lt-line-clamp__line')
----> 2 about = ''.join([line.find(text=True, recursive=False) for line in lines])

TypeError: sequence item 1: expected str instance, NoneType found

Upvotes: 0

Views: 30

Answers (1)

Mike67

Reputation: 11342

find returns None when nothing matches. Here the second span is empty, so it has no direct text node, which is why the error points at sequence item 1.

Try this code. It skips any span for which find returns None.

about = ''.join([line.find(text=True, recursive=False) for line in lines if line.find(text=True, recursive=False)])
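
If you would rather not call find twice per span, here is a minimal sketch of the same idea. The html string is a trimmed-down, made-up stand-in for the markup in the question, and the names texts and about_text are only illustrative. It uses string=True, which newer bs4 releases accept in place of text=True, and joins with a space rather than an empty string, which is just a choice for readability.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: a shortened version of the markup in the question.
html = (
    '<div>'
    '<span class="lt-line-clamp__line">First line.</span>'
    '<span class="lt-line-clamp__line"></span>'
    '<span class="lt-line-clamp__line lt-line-clamp__line--last"> Last line'
    '<span class="lt-line-clamp__ellipsis">...</span></span>'
    '</div>'
)

soup = BeautifulSoup(html, "html.parser")
lines = soup.select("span.lt-line-clamp__line")

# Look up each span's first direct text node once; the empty span gives None.
texts = [line.find(string=True, recursive=False) for line in lines]

# Drop the None entries and strip surrounding whitespace before joining.
about_text = " ".join(t.strip() for t in texts if t is not None)
print(about_text)
```

Checking for None explicitly keeps the filter tied to the actual cause of the error, since str.join only accepts strings.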

Upvotes: 1
