Preston G

Reputation: 69

Scraping a particular link from a web page with BeautifulSoup

I am new to scraping and am trying to scrape realtor data with BeautifulSoup from the following page: "https://www.realtor.com/realestateagents/New-Orleans_LA/pg-1".

I currently get the name and phone number of each realtor on the page using CSS selectors and store them in a dictionary. I would also like to get the href of each realtor's personal page and store it in the same dictionary.

It looks like several 'a' tags share the class jsx-1448471805, and I only need one href value per realtor.

The selector I am currently trying is:

link_selectors = "#agent_list_wrapper > div.jsx-372421607.cardWrapper > ul > div:nth-child(1) > div > div > div.jsx-1448471805.agent-list-card-img-wrapper.col-lg-2.col-sm-3.col-xxs-4 > a"

But I am having no luck with this.

How do I find a selector that pulls exactly one href per realtor, and how do I add that value to my existing dictionary 'realtors_data'?

Here is my current code:

from bs4 import BeautifulSoup
import requests
import numpy as np
import pandas as pd

realtors_data = {}
pages = np.arange(1, 2, 1)
print("PAGES: ", pages)
names_selector = "ul > div > div > div > div > div > a > div"
phone_selectors = "ul > div > div > div > div > div > div.jsx-1448471805.agent-phone.hidden-xs.hidden-xxs"
for page in pages:
    page = requests.get("https://www.realtor.com/realestateagents/New-Orleans_LA/pg-" + str(page))
    soup = BeautifulSoup(page.text, 'html.parser')
    names = soup.select(names_selector)
    phones = soup.select(phone_selectors)

    realtors = zip(names, phones)
    for name, phone in realtors:
        realtors_data[name.get_text()] = phone.get_text()


# Printing data
print(realtors_data)

Thank you!

Upvotes: 0

Views: 84

Answers (1)

Rob Raymond

Reputation: 31226

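Looking at the HTML, it is far simpler to navigate by class name than by a long positional selector. Each realtor sits in a div with class agent-list-card, and the div with class agent-name is wrapped by the 'a' tag that carries the href you want: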

from bs4 import BeautifulSoup
import requests

url = "https://www.realtor.com/realestateagents/New-Orleans_LA/pg-1"
req = requests.get(url)
soup = BeautifulSoup(req.content, 'html.parser')

# Build one dict per realtor card: name, phone, and relative profile link
names = []
for m in soup.find_all("div", class_="agent-list-card"):
    names.append({"name":m.find("div", class_="agent-name").text,
                  "phone":m.find("div", class_="agent-phone").text,
                  # the agent-name div is a direct child of the <a> holding the href
                  "link":m.find("div", class_="agent-name").parent["href"]
                 })

names

output

[{'name': 'Cathy Nunez',
  'phone': '(504) 258-5410',
  'link': '/realestateagents/cathy-nunez___3736136_103289755'},
 {'name': 'Olivia Ford',
  'phone': '(504) 343-1837',
  'link': '/realestateagents/olivia-ford_new-orleans_la_1996916_140289755'},
 {'name': 'Michelle Pennino',
  'phone': '(985) 502-1787',
  'link': '/realestateagents/michelle-pennino_mandeville_la_589632_090714455'},
 {'name': 'Lana Hunt',
  'phone': '(225) 933-6459',
  'link': '/realestateagents/lana-hunt_new-orleans_la_2053719_682189755'},
 {'name': 'Nicole Schlaudecker',
  'phone': '(504) 455-0100',
  'link': '/realestateagents/nicole-schlaudecker_metairie_la_1793628_718289755'},
 {'name': 'Jason Minardi',
  'phone': '(985) 645-1275',
  'link': '/realestateagents/jason-minardi_slidell_la_1817940_385614455'},
 {'name': 'John P. Dixon III',
  'phone': '(504) 657-0820',
  'link': '/realestateagents/john-p.-dixon-iii___3088323_713979755'},
 {'name': 'LIZ ASHE',
  'phone': '(504) 401-4285',
  'link': '/realestateagents/liz-ashe_metairie_la_34409_054499755'},
 {'name': "Steven & Heidi Blount/Heidi's Homes, LLC",
  'phone': '(985) 373-6233',
  'link': "/realestateagents/steven-&-heidi-blount-heidi's-homes,-llc_mandeville_la_1369154_537614455"},
 {'name': 'Lisa Julien',
  'phone': '(504) 247-7306',
  'link': '/realestateagents/lisa-julien_new-orleans_la_2203901_038089755'},
 {'name': 'Bonnie Buras Team',
  'phone': '(504) 392-0022',
  'link': '/realestateagents/bonnie-buras-team_belle-chasse_la_18326_371699755'},
 {'name': 'Emily B. Hoskin',
  'phone': '(504) 392-0022',
  'link': '/realestateagents/emily-b.-hoskin_belle-chasse_la_1151586_725289755'},
 {'name': 'Emily Haynie',
  'phone': '(504) 430-6004',
  'link': '/realestateagents/emily-haynie___1055620_198489755'},
 {'name': 'Patrice Milton Poree',
  'phone': '(504) 372-1100',
  'link': '/realestateagents/patrice-milton-poree_new-orleans_la_786531_025589755'},
 {'name': 'Harry VarnadoreTeam',
  'phone': '(504) 450-6916',
  'link': '/realestateagents/harry-varnadore_new-orleans_la_992038_608489755'},
 {'name': 'Leslie Heindel',
  'phone': '(504) 975-4252',
  'link': '/realestateagents/leslie-heindel_new-orleans_la_2152401_967189755'},
 {'name': 'Heather Shields',
  'phone': '(504) 450-9672',
  'link': '/realestateagents/heather-shields_new-orleans_la_3033967_680089755'},
 {'name': 'Brittany Picolo-Ramos',
  'phone': '(504) 300-5179',
  'link': '/realestateagents/brittany-picolo-ramos_metairie_la_1949330_532289755'},
 {'name': 'Brenda Kiefer',
  'phone': '(504) 441-8171',
  'link': '/realestateagents/brenda-kiefer_covington_la_1985750_774389755'},
 {'name': 'Brenda Newfield',
  'phone': '(504) 228-6500',
  'link': '/realestateagents/brenda-newfield_st.-rose_la_1886770_176289755'}]

Upvotes: 1
