user93353

Reputation: 14039

BeautifulSoup: Finding other elements after locating a div

Inside a page, I have the following HTML:

<div class="ProfileDesc">
<p>
    <span class="Title">Name</span>
    <span>Tom Ready</span>
</p>
<p>
    <span class="Title">Born</span>

<span>
    <bxi> 10 Jan 1960</bxi> 
<p>
    <span class="Title">Death</span>
    <span>
        <bxi> 01 Jun 2019</bxi>
    </span>
</p>
</div>

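The following code works for extracting the ProfileDesc div from the whole page:

from bs4 import BeautifulSoup

# page is assumed to be an already-fetched response (e.g. from requests.get)
soup = BeautifulSoup(page.content, 'html.parser')

mydivs = soup.find("div", {"class": "ProfileDesc"})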

I want the following output:

Name: Tom Ready
Born: 10 Jan 1960
Death: 01 Jun 2019

How do I extract these values after finding the ProfileDesc div?

Upvotes: 0

Views: 56

Answers (3)

Abhilash

Reputation: 2256

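If you can rely on the DOM structure (each p holds a title span followed by a value span), loop over the paragraphs:

mydivs = soup.find("div", {"class": "ProfileDesc"})

for element in mydivs.find_all("p"):
    # the first span in the paragraph is the title
    title = element.find("span")
    # find_next is the current name for the older findNext alias
    content = title.find_next("span")
    print("%s : %s" % (title.text.strip(), content.text.strip()))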

Output:

Name : Tom Ready
Born : 10 Jan 1960
Death : 01 Jun 2019
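
For a version you can run on its own, here is a minimal sketch of the same paragraph loop. It assumes the question's markup with the missing closing tags restored; the `HTML` constant and the `extract_profile` helper are hypothetical names, not part of the original answer. It uses `find_next_sibling` so the value is taken from the same paragraph as the title, and it collects the pairs into a dict instead of printing them directly.

```python
from bs4 import BeautifulSoup

# Assumption: the question's markup with the missing </span> and </p> restored.
HTML = """
<div class="ProfileDesc">
<p><span class="Title">Name</span> <span>Tom Ready</span></p>
<p><span class="Title">Born</span> <span><bxi> 10 Jan 1960</bxi></span></p>
<p><span class="Title">Death</span> <span><bxi> 01 Jun 2019</bxi></span></p>
</div>
"""


def extract_profile(html):
    """Map each title span's text to the text of the span that follows it."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="ProfileDesc")
    profile = {}
    if div is None:
        return profile
    for p in div.find_all("p"):
        title = p.find("span", class_="Title")
        if title is None:
            continue  # paragraph without a title span
        value = title.find_next_sibling("span")
        if value is None:
            continue  # title without a value span next to it
        profile[title.get_text(strip=True)] = value.get_text(strip=True)
    return profile


profile = extract_profile(HTML)
for key, value in profile.items():
    print(f"{key}: {value}")
```

Returning a dict keeps the parsing separate from the formatting, so the separator (`:` versus ` : `) becomes a choice made at print time.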

Upvotes: 1

Sergey Karbivnichiy

Reputation: 116

Your HTML is missing the closing span and p tags after "10 Jan 1960". With those tags restored, you can look up each field by its label:

name = soup.find('span',string='Name').parent.text.replace('Name','').strip()
born = soup.find('span',string='Born').parent.text.replace('Born','').strip()
death = soup.find('span',string='Death').parent.text.replace('Death','').strip()
print(f'Name: {name}')
print(f'Born: {born}')
print(f'Death: {death}')
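
A possible variant of the label lookup, sketched under the same assumption that the closing tags have been restored. The `field_value` helper and the `HTML` constant are hypothetical names introduced for illustration. Instead of taking the parent's text and removing the label with `replace`, it reads the span that follows the label, so a value that happens to contain the label text is left untouched, and a missing label yields `None` rather than an `AttributeError`.

```python
from bs4 import BeautifulSoup

# Assumption: the question's markup with the missing </span> and </p> restored.
HTML = """
<div class="ProfileDesc">
<p><span class="Title">Name</span> <span>Tom Ready</span></p>
<p><span class="Title">Born</span> <span><bxi> 10 Jan 1960</bxi></span></p>
<p><span class="Title">Death</span> <span><bxi> 01 Jun 2019</bxi></span></p>
</div>
"""


def field_value(container, label):
    """Return the text of the span following the label span, or None."""
    title = container.find("span", string=label)
    if title is None:
        return None  # no span whose text is exactly the label
    value = title.find_next_sibling("span")
    return value.get_text(strip=True) if value is not None else None


soup = BeautifulSoup(HTML, "html.parser")
div = soup.find("div", class_="ProfileDesc")
for label in ("Name", "Born", "Death"):
    print(f"{label}: {field_value(div, label)}")
```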

Upvotes: 2

sushanth

Reputation: 8302

Try this. It pairs up the stripped strings inside each p as key and value:

keys_ = set() # avoid duplicate keys

for p in mydivs.find_all("p"):
    ss = list(p.stripped_strings)

    for k, v in zip(ss[::2], ss[1::2]):
        if k in keys_:
            continue
            
        keys_.add(k)
        print(k, ":", v)

Output:

Name : Tom Ready
Born : 10 Jan 1960
Death : 01 Jun 2019
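
To show why the duplicate check matters, here is a self-contained sketch that runs the same pairing on the question's markup exactly as posted, with the "Born" span and p left unclosed. The `BROKEN_HTML` constant and the `pair_strings` helper are hypothetical names. The sketch assumes `html.parser`, which does not repair the nesting, so the "Death" paragraph ends up inside the "Born" paragraph and its strings are seen twice; `dict.setdefault` keeps the first value for each key, playing the role of the `keys_` set above.

```python
from bs4 import BeautifulSoup

# The question's markup as posted: the "Born" <span> and <p> are never closed.
BROKEN_HTML = """
<div class="ProfileDesc">
<p>
    <span class="Title">Name</span>
    <span>Tom Ready</span>
</p>
<p>
    <span class="Title">Born</span>

<span>
    <bxi> 10 Jan 1960</bxi>
<p>
    <span class="Title">Death</span>
    <span>
        <bxi> 01 Jun 2019</bxi>
    </span>
</p>
</div>
"""


def pair_strings(html):
    """Pair consecutive stripped strings in each <p>, keeping the first value per key."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="ProfileDesc")
    result = {}
    for p in div.find_all("p"):
        strings = list(p.stripped_strings)
        # even positions are titles, odd positions are the values that follow
        for key, value in zip(strings[::2], strings[1::2]):
            result.setdefault(key, value)  # ignore keys already seen
    return result


for key, value in pair_strings(BROKEN_HTML).items():
    print(key, ":", value)
```

This approach relies on the strings strictly alternating between title and value; a paragraph with an empty value or extra text would shift the pairing.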

Upvotes: 1
