mrGreenBrown

Reputation: 596

Scrape a value based on sibling content using BeautifulSoup

Suppose I have HTML of the following form:

...
<div class="class1">
    <div class="subclass1">Text1</div>
    <div class="subclass2">Text2</div>  
</div>
<div class="class1">
    <div class="subclass1">Text3</div>
    <div class="subclass2">Text4</div>  
</div>
<div class="class1">
    <div class="subclass1">Text5</div>
    <div class="subclass2">Text6</div>  
</div>
...

How can I extract Text2 given Text1?

I have several ideas, but they all involve a complicated structure with a loop and conversions between a list and BeautifulSoup objects. Is there a simpler way?

Upvotes: 1

Views: 261

Answers (1)

宏杰李

Reputation: 12158

# Find the div whose text is 'Text1', then take the next div that follows it.
text2 = soup.find('div', text='Text1').find_next('div').text

Output:

'Text2'
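
For a self-contained check, here is a minimal sketch of the same lookup run against the markup from the question. It assumes the built-in html.parser backend, the helper name text_after is hypothetical, and it uses string=, which is the newer spelling of the text= argument in Beautiful Soup 4.4.0 and later.

```python
from bs4 import BeautifulSoup

html = """
<div class="class1">
    <div class="subclass1">Text1</div>
    <div class="subclass2">Text2</div>
</div>
<div class="class1">
    <div class="subclass1">Text3</div>
    <div class="subclass2">Text4</div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def text_after(soup, label):
    """Hypothetical helper: text of the first div that follows the div whose string is `label`."""
    anchor = soup.find("div", string=label)
    if anchor is None:  # label not present in the document
        return None
    following = anchor.find_next("div")
    return following.get_text() if following is not None else None


print(text_after(soup, "Text1"))
print(text_after(soup, "Text3"))
```

Returning None for a missing label avoids the AttributeError that the one-liner raises when find comes back empty.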

Or:

soup.find('div', text='Text1').next_sibling.next_element.text

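This is not recommended: it relies on the whitespace text node that sits between the two div tags, so it breaks if the markup is formatted differently.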
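
If a sibling-based lookup is preferred over the next_sibling.next_element chain, find_next_sibling can be given a tag name so that intervening text nodes are skipped. The sketch below is one possible variant, not the answerer's code; the sample markup deliberately mixes compact and indented blocks, and the pairs dictionary is an illustrative addition for looking up every value at once.

```python
from bs4 import BeautifulSoup

# First block has no whitespace between the inner divs, second block does.
html = (
    '<div class="class1">'
    '<div class="subclass1">Text1</div><div class="subclass2">Text2</div>'
    '</div>'
    '<div class="class1">\n'
    '    <div class="subclass1">Text3</div>\n'
    '    <div class="subclass2">Text4</div>\n'
    '</div>'
)
soup = BeautifulSoup(html, "html.parser")

# find_next_sibling("div") only considers sibling <div> tags,
# so whitespace between the tags does not matter.
value = soup.find("div", string="Text1").find_next_sibling("div").get_text()
print(value)

# Illustrative: map every subclass1 text to its subclass2 text in one pass.
pairs = {}
for block in soup.find_all("div", class_="class1"):
    key = block.find("div", class_="subclass1")
    val = block.find("div", class_="subclass2")
    if key is not None and val is not None:
        pairs[key.get_text(strip=True)] = val.get_text(strip=True)
print(pairs)
```

Building the dictionary does use a loop, but it stays within one pass over the class1 blocks and makes repeated lookups trivial.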

Upvotes: 2
