How to select a specific element using BeautifulSoup

Question

I am trying to scrape some profile information from LinkedIn. I came across an HTML structure with the layout below, and I need to select only "Abeokuta, Ogun State" and disregard "Contract".

This is a page sample: https://www.linkedin.com/in/habibulah-oyero-44069a193/

html structure


<p class="pv-entity__secondary-title t-14 t-black t-normal">
      Abeokuta, Ogun State
      <span>Contract</span>
</p>

python code

from bs4 import BeautifulSoup

src = browser.page_source
soup = BeautifulSoup(src, "lxml")

experience_div = soup.find("section", {"id": "experience-section"})

job_div = experience_div.find("div", {"class": "pv-entity__summary-info pv-entity__summary-info--background-section"})

job_location = job_div.find("p", {"class": "pv-entity__secondary-title"}).text.strip()

print(job_location)

This returns:

Abeokuta, Ogun State
        Contract

MendelG · Accepted Answer

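To get only the first piece of text, you can call .find_next(text=True) on the tag. It returns the first text node that follows the tag's opening, which here is the location; the nested "Contract" text is never reached: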

from bs4 import BeautifulSoup


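# Sample markup. The <p> class comes from the page; the <span> around
# "Contract" is an assumption, since the original tags were lost.
html = """
<p class="pv-entity__secondary-title t-14 t-black t-normal">
      Abeokuta, Ogun State
      <span>Contract</span>
</p>
"""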

soup = BeautifulSoup(html, "html.parser")

print(
    soup.find("p", class_="pv-entity__secondary-title t-14 t-black t-normal")
    .find_next(text=True)
    .strip()
)

Alternatively, you can use .contents, which is the list of the tag's direct children, and take its first item:

print(
    soup.find("p", class_="pv-entity__secondary-title t-14 t-black t-normal")
    .contents[0]
    .strip()
)

Output (in both solutions):

Abeokuta, Ogun State
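
A third option, sketched here as a possible variation rather than part of the original answer, is to iterate over .stripped_strings and take the first value. The markup below is an assumption (in particular the <span> wrapping "Contract"), and it matches on the single class pv-entity__secondary-title, as in the question's code, rather than on the full class string:

```python
from bs4 import BeautifulSoup

# Assumed sample markup: the <span> around "Contract" is a guess at the real page.
html = """
<p class="pv-entity__secondary-title t-14 t-black t-normal">
      Abeokuta, Ogun State
      <span>Contract</span>
</p>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ with a single class name matches tags that carry that class among others.
p = soup.find("p", class_="pv-entity__secondary-title")

# stripped_strings yields each text node with surrounding whitespace removed,
# skipping whitespace-only nodes; next(..., None) avoids an error if it is empty.
job_location = next(p.stripped_strings, None) if p is not None else None

print(job_location)
```

This avoids calling .strip() on the result and returns None instead of raising an exception when the tag is missing or has no text.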
