BS4 | Python | Scraping specific data in div with multiple values

Question

I'm pretty new to Python. I've had some success experimenting with web scraping using BS4, and I'm now working on a personal project where I'm indexing AutoTrader listings from an HTML file.

So far I'm able to scrape every value I need except one. I've searched and can't find a solution.

I need to extract the province "BC" from data-payment-province="BC" in the code below.

I've tried location = soup.find_all('div', class_='data-payment-province')

but it returns [].

I'm probably missing something obvious, but I'm honestly stumped.

Also, I should probably ask this in a separate question, but does anyone know how to get only the values as output, rather than the HTML together with the values?

e.g.

Current:

itemOffered = soup.find_all("span", itemprop="itemOffered")

OUTPUT:

, 
2019 Hyundai Elantra GT | Bluetooth | Backup Camera | Heated Seats | Blind

Desired OUTPUT:

2019 Hyundai Elantra GT

Rickey · Accepted Answer

Give this a shot for your first problem:

import requests
from bs4 import BeautifulSoup
import re

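# ..... (build `soup` from your page or saved HTML file here)

# Matches attribute values containing a two-letter uppercase code such as "BC"
province_re = re.compile(r'[A-Z]{2}')

# Filter on the data-payment-province attribute itself, not on the class
location = soup.find_all('div', {'data-payment-province': province_re})

# Print the attribute value of each matching div
for loc in location:
    print(loc.attrs['data-payment-province'])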
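
As for why the original call returned []: class_ filters on the class attribute, whereas data-payment-province is an attribute of its own, so nothing matched. Below is a minimal sketch of an alternative that skips the regex and also touches on the second question. It assumes markup roughly like SAMPLE_HTML, which is made up for illustration and is not the real AutoTrader page. It also assumes the vehicle name is always the part before the first "|".

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the saved AutoTrader HTML file.
SAMPLE_HTML = """
<div class="listing" data-payment-province="BC">
  <span itemprop="itemOffered">2019 Hyundai Elantra GT | Bluetooth | Backup Camera</span>
</div>
<div class="listing" data-payment-province="ON">
  <span itemprop="itemOffered">2018 Honda Civic | Heated Seats</span>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# attrs={name: True} matches every div that has the attribute, whatever its value.
provinces = [
    div["data-payment-province"]
    for div in soup.find_all("div", attrs={"data-payment-province": True})
]

# get_text() drops the tags; splitting on "|" keeps only the leading vehicle name.
titles = [
    span.get_text(strip=True).split("|")[0].strip()
    for span in soup.find_all("span", itemprop="itemOffered")
]

print(provinces)
print(titles)
```

If the listings on your page separate the name from the features differently, adjust the split accordingly.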
