BeautifulSoup findAll with name and text

Question

I'm trying to match the TH tag in the below HTML (file.txt):



Name             Age  Positions
Stephen A. Wynn  60   Chairman of the Board and Chief Executive Officer
Kazuo Okada      60   Vice Chairman of the Board

I have tried the following, but it doesn't seem to work:

from bs4 import BeautifulSoup

infile = open("file.txt")
soup = BeautifulSoup(infile.read())
#this works
soup.findAll('th')
#this works but isn't particularly useful...
soup.findAll(text="Age")
#this is what I really want, but it returns an empty list
soup.findAll('th', text="Age")

Thanks for the help!

TimD · Accepted Answer

As far as I can tell, you want the th element whose text is "Age". There are several ways to do that. The simplest is to find all the th elements first, then iterate over them and keep the ones that contain "Age". The code below should be useful.

out = []
# Collect every th element first
foo = soup.findAll("th")
for bar in foo:
    # Keep the th if any string inside it is exactly "Age"
    if bar.find(text="Age"):
        out.append(bar)
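As a rough, self-contained sketch of the same idea: when a text filter is combined with a tag name, Beautiful Soup compares it against the tag's .string, which is None if the tag has more than one child (for example, extra whitespace or a nested tag). That may be why the combined call in the question returns nothing, though it depends on the real markup and the bs4 version. The markup below is an assumption, since the original tags are not shown, and th_with_text is a hypothetical helper name. It uses find_all, the current spelling of findAll.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the nested <b> tags and the whitespace inside each
# <th> are assumptions, since the original HTML source is not shown.
html = """
<table>
  <tr>
    <th>
      <b>Name</b>
    </th>
    <th>
      <b>Age</b>
    </th>
    <th>
      <b>Positions</b>
    </th>
  </tr>
  <tr>
    <td>Stephen A. Wynn</td>
    <td>60</td>
    <td>Chairman of the Board and Chief Executive Officer</td>
  </tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")


def th_with_text(soup, wanted):
    """Return every <th> whose visible text, stripped of whitespace, equals `wanted`."""
    return [th for th in soup.find_all("th") if th.get_text(strip=True) == wanted]


matches = th_with_text(soup, "Age")

# The same filter written as a function passed straight to find_all.
same = soup.find_all(
    lambda tag: tag.name == "th" and tag.get_text(strip=True) == "Age"
)
```

Comparing against get_text(strip=True) instead of .string means nested tags and surrounding whitespace do not stop the match. Both matches and same should hold the single th that wraps "Age".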
