Extracting with .find() the second of 2 identical 'div' from html page with BS4

Question

I'm trying to extract the second of two identical 'div' elements from a soup element. When parsing through it and extracting with the .find() method, I only ever get the first one from the top. How can I tell the script to skip the first and get the next one if some condition is met? Below is the HTML I want to extract from.

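<div class="a-row a-size-base a-color-secondary">MPAA Rating: PG (Parental Guidance Suggested)</div>
<div class="a-row a-size-base a-color-secondary">$0.00 with a CONtv trial on Prime Video Channels</div>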

This is the code I'm trying:

if '$' not in str(product.find('div', {'class': 'a-row a-size-base a-color-secondary'})):
    print('NOT IN')
    pass
    price = product.find('div', {'class': 'a-row a-size-base a-color-secondary'})
    print(price)
else:
    price = product.find('div', {'class': 'a-row a-size-base a-color-secondary'})
    print(price)

However, as a result it still gives me this:

NOT IN
MPAA Rating: PG (Parental Guidance Suggested)

Rather than this:

$0.00 with a CONtv trial on Prime Video Channels

Any suggestions?

QHarr · Accepted Answer

You need find_all and then index into the returned list, as find only ever returns the first match. You can do the same thing with select. With bs4 4.7.1 you can use :contains to target an element by a substring of its inner text (e.g. CONtv trial), then use select_one if you want the first match or select if you want all matches. (Soup Sieve 2.1 and later prefer the spelling :-soup-contains.) Test the result for None before attempting to access .text, since find and select_one return None when nothing matches.

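from bs4 import BeautifulSoup as bs

# Minimal reconstruction of the markup: two divs sharing the same class list
html = '''
<div class="a-row a-size-base a-color-secondary">MPAA Rating: PG (Parental Guidance Suggested)</div>
<div class="a-row a-size-base a-color-secondary">$0.00 with a CONtv trial on Prime Video Channels</div>
'''
soup = bs(html, 'lxml')

# find_all returns a list of every match; index 1 is the second div
print(soup.find_all('div', {'class': 'a-row a-size-base a-color-secondary'})[1].text)
# select does the same with a CSS selector
print(soup.select('.a-color-secondary')[1].text)
# :contains matches on a substring of the element's text
print(soup.select_one('.a-color-secondary:contains("CONtv trial")').text)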
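
The None test mentioned above could look something like the sketch below. The helper name text_or_default and the class .no-such-class are made up for illustration, the markup is a minimal stand-in for the real page, and html.parser is used only so the example does not need lxml installed.

```python
from bs4 import BeautifulSoup

# Stand-in markup (assumed): two divs with the same class list
html = '''
<div class="a-row a-size-base a-color-secondary">MPAA Rating: PG (Parental Guidance Suggested)</div>
<div class="a-row a-size-base a-color-secondary">$0.00 with a CONtv trial on Prime Video Channels</div>
'''
soup = BeautifulSoup(html, 'html.parser')


def text_or_default(node, default=''):
    """Hypothetical helper: stripped text of a tag, or `default` if the lookup returned None."""
    return node.get_text(strip=True) if node is not None else default


hit = soup.select_one('.a-color-secondary')   # first matching div
miss = soup.select_one('.no-such-class')      # nothing matches, so this is None

print(text_or_default(hit))
print(text_or_default(miss, 'not found'))
```

Calling .text directly on miss would raise AttributeError, which is why the check comes first.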

Looping with find_all

matches = soup.find_all('div', {'class': 'a-row a-size-base a-color-secondary'})
for item in matches:
    if '$' in str(item):
        print(item.text)
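
If you only want the first div that satisfies the condition, the loop can be wrapped in a small function that returns None when nothing qualifies. This is just a sketch: first_with_substring is a hypothetical name, the markup is a minimal stand-in for the real page, and html.parser is used only to avoid the lxml dependency.

```python
from bs4 import BeautifulSoup


def first_with_substring(soup, name, css_class, needle):
    """Hypothetical helper: first <name> tag with the given class whose text contains `needle`."""
    for tag in soup.find_all(name, {'class': css_class}):
        if needle in tag.get_text():
            return tag
    return None  # no tag matched


# Stand-in markup (assumed): two divs with the same class list
html = '''
<div class="a-row a-size-base a-color-secondary">MPAA Rating: PG (Parental Guidance Suggested)</div>
<div class="a-row a-size-base a-color-secondary">$0.00 with a CONtv trial on Prime Video Channels</div>
'''
soup = BeautifulSoup(html, 'html.parser')
css_class = 'a-row a-size-base a-color-secondary'

price = first_with_substring(soup, 'div', css_class, '$')
if price is not None:
    print(price.get_text(strip=True))
```

Unlike indexing with [1], this does not depend on the price always being the second div.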
