BeautifulSoup: Getting a newline when I access soup.head.next_sibling with BeautifulSoup4

Question

I am trying an example from the Beautiful Soup documentation and it behaves unexpectedly. When I access the next_sibling value of the head tag, I get a newline string ('\n') instead of the "body" tag.

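html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""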
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc)
soup.head.next_sibling
u'\n'

I am using the latest version of BeautifulSoup4, i.e. 4.3.2. Please help me out. Thanks in advance.

alecxe · Accepted Answer

There are 3 kinds of objects that BeautifulSoup "sees" in the HTML:

Tag
NavigableString
Comment

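When you access .next_sibling, it returns the next object after the current one, whatever its type. In your case that is a text node (a NavigableString) holding the newline between the closing head tag and the opening body tag. This is explained in the Beautiful Soup documentation, in the section on .next_sibling and .previous_sibling.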
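To see what kind of object comes back, here is a minimal sketch. It assumes a shortened version of the document and the built-in "html.parser" (neither is from the original post), so other parsers may treat the whitespace differently.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Shortened stand-in for the question's document (an assumption for illustration).
html_doc = (
    "<html><head><title>The Dormouse's story</title></head>\n"
    "<body><p>...</p></body></html>"
)
soup = BeautifulSoup(html_doc, "html.parser")

sibling = soup.head.next_sibling

# The newline between </head> and <body> is a text node, not a Tag.
print(type(sibling).__name__)
print(isinstance(sibling, NavigableString))
print(repr(sibling))
```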

If you want the next Tag after the current one, use find_next_sibling(), which skips over text nodes. You can also pass the tag name to be explicit: find_next_sibling("body").
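A short sketch of both calls, again assuming a shortened document and the built-in "html.parser" rather than the exact setup from the question:

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the question's document (an assumption for illustration).
html_doc = (
    "<html><head><title>The Dormouse's story</title></head>\n"
    "<body><p>...</p></body></html>"
)
soup = BeautifulSoup(html_doc, "html.parser")

# Without arguments: the first sibling after <head> that is a Tag.
next_tag = soup.head.find_next_sibling()

# With a name: the first following sibling that is a <body> tag.
body = soup.head.find_next_sibling("body")

print(next_tag.name)
print(next_tag is body)
```

If no matching sibling exists, find_next_sibling() returns None, so check the result before using it on arbitrary input.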

You can also use the "next sibling" CSS selector. Note that select() returns a list of matching tags, not a single tag:

soup.select("head + *")
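For completeness, a sketch of pulling the tag out of that result. The shortened document and the "html.parser" argument are assumptions for illustration, not part of the original post.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the question's document (an assumption for illustration).
html_doc = (
    "<html><head><title>The Dormouse's story</title></head>\n"
    "<body><p>...</p></body></html>"
)
soup = BeautifulSoup(html_doc, "html.parser")

# "head + *" matches any element that immediately follows <head>;
# text nodes such as the newline are ignored by CSS selectors.
matches = soup.select("head + *")

# Guard against an empty list before indexing.
body = matches[0] if matches else None
print([tag.name for tag in matches])
```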
