Python + BeautifulSoup - Text extraction by searching criteria

Question

A file contains HTML like the following (the words 'Registration' and 'Flying' are fixed labels in the paragraphs below):


Registration
02 Mar 2006


Flying
24 Jun 2005

I want to extract the values and output them as:

Registration 02 Mar 2006

Flying 24 Jun 2005

I am using BeautifulSoup's find_next_sibling, but it returns nothing. What went wrong?

from bs4 import BeautifulSoup

url = r"C:\example.html"
page = open(url)
soup = BeautifulSoup(page.read())

aa = soup.find_next_sibling(text='Registration')

print aa

loki · Accepted Answer

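find_next_sibling searches the siblings of the object it is called on, and the top-level soup object has no siblings, so the call returns nothing. Find the text node first, then move to the next td. Try this: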

soup.find(text="Registration").findNext('td').contents[0]
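For a fuller picture, here is a minimal sketch of the same approach applied to both labels. The question's markup did not survive, so the table layout below is an assumption (the answer's use of 'td' suggests table cells), and the helper name extract_pairs is hypothetical. It uses string=, the newer name for the text= argument, and find_next, the newer spelling of findNext.

```python
from bs4 import BeautifulSoup

# Assumed markup: the real HTML is not shown in the question, so this
# two-row table is only an illustration.
html = """
<table>
  <tr><td>Registration</td><td>02 Mar 2006</td></tr>
  <tr><td>Flying</td><td>24 Jun 2005</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")


def extract_pairs(soup, labels):
    """Return 'label value' strings for each label found in the document."""
    results = []
    for label in labels:
        # Locate the text node whose content is exactly the label.
        node = soup.find(string=label)
        if node is None:
            continue  # label not present in this document
        # Starting from that text node, move to the next <td> in document order.
        cell = node.find_next("td")
        if cell is None:
            continue  # no following cell to read a value from
        results.append(f"{label} {cell.get_text(strip=True)}")
    return results


lines = extract_pairs(soup, ["Registration", "Flying"])
for line in lines:
    print(line)
```

If the real page wraps the labels in other tags or pads them with whitespace, the exact-match string= lookup may need to be replaced with a regular expression or a function.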
