Getting a specific text from html using BeautifulSoup

Question

I have this .html code:


            
                
                    
                            1.89 s
                        I need to get this text

I need to get only the text that sits directly inside the outer tag, outside all of the nested tags (the text is: I need to get this text).

I was trying to use this piece of code:

path = document.find('li', class_='level top').find_all("em")[-1].next_sibling
if not path:
    path = document.find('li', class_='level top failed open').find_all("em")[-1].next_sibling
return path

But I get an error: AttributeError: 'NoneType' object has no attribute 'find_all'.

Does anybody know how to access this text?

Thank you!

Md. Fazlul Hoque · Accepted Answer

You can use .contents, which returns a list of the tag's direct children (both nested tags and bare text nodes). The text you want is the last item in that list, so index it with [-1].

html = '''

 
  
   
    
     
      1.89 s
     
    
    I need to get this text
   
  
 


'''

from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
# print(soup.prettify())

# .contents lists the span's direct children; the trailing text node is the last one
txt = soup.select_one('#tree > li > span').contents[-1]
print(txt)

Output:

  I need to get this text
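
As for the AttributeError in the question: find() returns None when nothing matches, and a class_ value containing spaces is compared against the exact string of the class attribute, so class_='level top' will not match an element whose attribute is "level top failed open". The .find_all call then runs on None before the `if not path` check is ever reached. Below is a minimal sketch of a guarded lookup. The markup is an assumption reconstructed from the selectors used in this thread (the real page may nest things differently), and trailing_text is a hypothetical helper name.

```python
from bs4 import BeautifulSoup, NavigableString

# Assumed markup: tag names, id, and classes are inferred from the
# selectors in this thread, not copied from the real page.
html = """
<ul id="tree">
  <li class="level top failed open">
    <span><em>1.89 s</em>I need to get this text</span>
  </li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

# An exact-string class filter does not match when the attribute
# carries extra classes, so this lookup finds nothing.
no_match = soup.find("li", class_="level top")


def trailing_text(soup):
    """Return the last bare text node of the first matching span, or None."""
    # li.level.top matches any <li> that has both classes,
    # regardless of what other classes it also carries.
    span = soup.select_one("li.level.top > span")
    if span is None:
        return None
    # Keep only text nodes, skipping nested tags such as <em>.
    texts = [child for child in span.contents if isinstance(child, NavigableString)]
    return texts[-1].strip() if texts else None


result = trailing_text(soup)
print(no_match)
print(result)
```

Using a CSS selector here means one lookup covers both the 'level top' and 'level top failed open' cases, and the explicit None check replaces the fallback branch from the question.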
