XPath: how to extract the HTML from one tag?

Question

I need to extract the HTML (tags together with their text) from a single tag on a page. For example:


 
  
   text  text 
 text  text 
    text 
  
   another text  text

I need the HTML inside the first div with class "post":

text  text 
 text  text 
    text

with tags.

So far I can only extract the text, using this XPath: "(//div[@class="post"])[1]/descendant-or-self::*[not(name()="script")]/text()". The result is: text text text text text

I also tried "(//div[@class="post_body"])[1]/node()", but I don't know how to build a string from the result.

P.S. Alternatively, please suggest another way, for example with BeautifulSoup. Please help.

Sede · Accepted Answer

Use BeautifulSoup's find() method, which returns only the first matching div, then join its children back into a string.

from bs4 import BeautifulSoup   
soup = BeautifulSoup("""
     
      
       text  text 
 text  text 
        text 
      
       another text  text 
     
    """)

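# find() returns only the first <div class="post">; iterating over it yields its direct children.
first_div = soup.find('div', attrs={'class': 'post'})
# Child tags are serialized back to HTML; bare text nodes are stripped of surrounding whitespace.
first_div_html = [child.strip() if isinstance(child, str) else str(child) for child in first_div]
print(''.join(first_div_html))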

Output

text text 
text text  text
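
If the goal is simply the inner HTML of the first matching div as one string, a shorter route may be Tag.decode_contents(), which serializes everything inside a tag without the tag itself. The snippet below is only a sketch: the sample markup (the b, br and i tags) is an assumption made up for illustration, since the tags of the original example are not shown here, and the "html.parser" backend is chosen just to avoid extra dependencies.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the inner tags are an assumption for illustration only.
html = """
<div class="post">
  <b>text</b> text<br/>
  <i>text</i> text
</div>
<div class="post">another text</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns only the first element that matches.
first_post = soup.find("div", class_="post")

# decode_contents() renders the children (tags and text) as a single string,
# leaving out the enclosing <div> itself.
inner_html = first_post.decode_contents().strip()
print(inner_html)
```

Unlike stripping each text node individually, this keeps the whitespace between children as it appears in the source, so it is better suited when the markup should be preserved as-is.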
