john

Reputation: 37

beautifulsoup find text between span

I want to get just the text from a span:

html = <a class="business-name" data-analytics='{"click_id":1600,"target":"name","feature_click":""}' href="/new-york-ny/bpp/upper-eastside-orthodontists-20151" rel=""><span>Upper Eastside Orthodontists</span></a>

        name = html.find('a', {'class', 'business-name'})
        print(name.find('span').text)

This gives me the following error:

    print(name.find('span').text)
  AttributeError: 'NoneType' object has no attribute 'text'

I want to get just the text: Upper Eastside Orthodontists

Upvotes: 0

Views: 72

Answers (1)

chitown88

Reputation: 28640

What you are actually looking for is not in the HTML returned by the static/initial request. The page is rendered dynamically.
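Before changing approach, it can help to guard each lookup instead of chaining `.find(...).text`, so a missing element shows up as `None` rather than an `AttributeError`. A minimal sketch, using the anchor tag from the question as inline HTML; `span_text` is a hypothetical helper name, not part of BeautifulSoup:

```python
from bs4 import BeautifulSoup


def span_text(html):
    """Return the text of the <span> inside a.business-name, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.find("a", {"class": "business-name"})
    if link is None:
        # The anchor is not in this HTML at all
        return None
    span = link.find("span")
    return span.get_text(strip=True) if span is not None else None


# Markup copied from the question (data-analytics attribute omitted for brevity)
snippet = (
    '<a class="business-name" '
    'href="/new-york-ny/bpp/upper-eastside-orthodontists-20151" rel="">'
    "<span>Upper Eastside Orthodontists</span></a>"
)

print(span_text(snippet))
print(span_text("<html><body></body></html>"))
```

If the helper returns `None` for `response.text`, the element is not in what `requests` downloaded, whatever the browser shows.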

Luckily, the data is also embedded as JSON inside the page's <script> tags, so you can pull out the JSON and parse it from there:

import requests
from bs4 import BeautifulSoup
import re
import json
import pandas as pd


url = 'https://www.superpages.com/new-york-ny/dentists?page=1'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

# The listing data sat in the second-to-last JSON-LD script tag when this was written
script = soup.find_all('script', {'type':"application/ld+json"})[-2]

# Grab everything from the first '{' to the last '}' inside the tag
p = re.compile('({.*})')
result = p.search(str(script))

data = json.loads(result.group(0))

# One row per search result, with @type, name and url columns
df = pd.DataFrame(data['mainEntity']['itemListElement'])

Output:

print(df.to_string())
       @type                                         name                                                                                               url
0   ItemPage                 Upper Eastside Orthodontists                     https://www.superpages.com/new-york-ny/bpp/upper-eastside-orthodontists-20151
1   ItemPage                                         Kara                                           https://www.superpages.com/new-york-ny/bpp/kara-5721648
2   ItemPage                  Central Park West Dentistry                  https://www.superpages.com/new-york-ny/bpp/central-park-west-dentistry-471054528
3   ItemPage  Majid Rajabi Khamesi Advanced Family Dental  https://www.superpages.com/new-york-ny/bpp/majid-rajabi-khamesi-advanced-family-dental-542761105
4   ItemPage                     Robert Veligdan, DMD, PC                        https://www.superpages.com/new-york-ny/bpp/robert-veligdan-dmd-pc-21238912
5   ItemPage                         Irina Rossinski, DDS                          https://www.superpages.com/new-york-ny/bpp/irina-rossinski-dds-462447740
6   ItemPage                           Dr. Michael J. Wei                             https://www.superpages.com/new-york-ny/bpp/dr-michael-j-wei-504012551
7   ItemPage                         Manhattan Dental Spa                          https://www.superpages.com/new-york-ny/bpp/manhattan-dental-spa-22612348
8   ItemPage                             Expert Dental PC                             https://www.superpages.com/new-york-ny/bpp/expert-dental-pc-459327373
9   ItemPage             Dr. Jonathan Freed, D.D.S., P.C.                  https://www.superpages.com/new-york-ny/bpp/dr-jonathan-freed-d-d-s-p-c-503142997
10  ItemPage                  Clifford S. Melnick, DMD PC                    https://www.superpages.com/new-york-ny/bpp/clifford-s-melnick-dmd-pc-512698216
11  ItemPage                          Ronald Birnbaum Dds                            https://www.superpages.com/new-york-ny/bpp/ronald-birnbaum-dds-2757412
12  ItemPage                        Concerned Dental Care                        https://www.superpages.com/new-york-ny/bpp/concerned-dental-care-453434343
13  ItemPage              DownTown Dental Cosmetic Center              https://www.superpages.com/new-york-ny/bpp/downtown-dental-cosmetic-center-468569119
14  ItemPage                         Beth Caunitz, D.D.S.                           https://www.superpages.com/new-york-ny/bpp/beth-caunitz-d-d-s-479935675
15  ItemPage                       Alice Urbankova DDS, P                        https://www.superpages.com/new-york-ny/bpp/alice-urbankova-dds-p-474879958
16  ItemPage                             Wu Darryl DDS PC                               https://www.superpages.com/new-york-ny/bpp/wu-darryl-dds-pc-8291524
17  ItemPage                             Gerald Rosen DDS                             https://www.superpages.com/new-york-ny/bpp/gerald-rosen-dds-470302208
18  ItemPage                          Group Health Dental                           https://www.superpages.com/new-york-ny/bpp/group-health-dental-15648711
19  ItemPage                       Dr. Shaun Massiah, DMD                         https://www.superpages.com/new-york-ny/bpp/dr-shaun-massiah-dmd-453290181
20  ItemPage                               Park 56 Dental             https://www.superpages.com/new-york-ny/bpp/park-56-dental-479624928?lid=1001970746762
21  ItemPage                               Rubin Esther S                               https://www.superpages.com/new-york-ny/bpp/rubin-esther-s-462458952
22  ItemPage                           David P Pitman DMD                             https://www.superpages.com/new-york-ny/bpp/david-p-pitman-dmd-9139813
23  ItemPage                   Daniell Jason Mishaan, DMD                    https://www.superpages.com/new-york-ny/bpp/daniell-jason-mishaan-dmd-479623764
24  ItemPage                          Dolman Oral Surgery                          https://www.superpages.com/new-york-ny/bpp/dolman-oral-surgery-534333982
25  ItemPage                                Emagen Dental                                https://www.superpages.com/new-york-ny/bpp/emagen-dental-460512214
26  ItemPage                    The Exchange Dental Group                    https://www.superpages.com/new-york-ny/bpp/the-exchange-dental-group-462981940
27  ItemPage            Joshua M. Wilges DDS & Associates               https://www.superpages.com/new-york-ny/bpp/joshua-m-wilges-dds-associates-497873451
28  ItemPage                           Oren Rahmanan, DDS                            https://www.superpages.com/new-york-ny/bpp/oren-rahmanan-dds-472633138
29  ItemPage                       Victoria Veytsman, DDS                        https://www.superpages.com/new-york-ny/bpp/victoria-veytsman-dds-456826960

You could then iterate through each link to get more data from each business's own page.
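As a variation on the approach above, the regex and the hard-coded `[-2]` index can be avoided by reading each JSON-LD tag's text directly and keeping the first one that has the expected keys. This is only a sketch: `SAMPLE_HTML` is a made-up, trimmed-down stand-in for the real page source, and `extract_listings` is a hypothetical helper name.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical stand-in for the page source; the real page has more tags and fields.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">{"@type": "WebSite", "name": "example"}</script>
<script type="application/ld+json">
{"@type": "SearchResultsPage",
 "mainEntity": {"@type": "ItemList",
                "itemListElement": [
                  {"@type": "ItemPage", "name": "Upper Eastside Orthodontists",
                   "url": "https://www.superpages.com/new-york-ny/bpp/upper-eastside-orthodontists-20151"},
                  {"@type": "ItemPage", "name": "Kara",
                   "url": "https://www.superpages.com/new-york-ny/bpp/kara-5721648"}
                ]}}
</script>
</head><body></body></html>
"""


def extract_listings(html):
    """Return the itemListElement list from the first JSON-LD tag that has one."""
    soup = BeautifulSoup(html, "html.parser")
    for script in soup.find_all("script", {"type": "application/ld+json"}):
        text = script.string
        if not text:
            continue
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            # Skip tags whose content is not valid JSON
            continue
        if not isinstance(data, dict):
            continue
        main = data.get("mainEntity")
        if isinstance(main, dict) and "itemListElement" in main:
            return main["itemListElement"]
    return []


listings = extract_listings(SAMPLE_HTML)
for item in listings:
    print(item["name"], item["url"])
```

The returned list of dicts can be passed to `pd.DataFrame` exactly as in the code above.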

The other option, which is a little trickier, is to parse the HTML directly, since I did find the data there as well. It's only tricky in that you need to cut out the excess: there's a sponsored ad before the results, and there are more entries after the initial 30 results that don't follow the same HTML structure/pattern.

import requests
from bs4 import BeautifulSoup
import pandas as pd


url = 'https://www.superpages.com/new-york-ny/dentists?page=1'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

businesses = soup.find_all('a', {'class':'business-name'})
rows = []
# Skip the sponsored ad at index 0 and everything after the first 30 results
for each in businesses[1:31]:
    name = each.text
    # find_next returns the first matching tag that follows this link in the document
    address = each.find_next('div', {'class':'street-address'}).text
    phone = each.find_next('a', {'class':'phones phone primary'}).text.replace('Call Now','')

    row = {'name':name,
           'address':address,
           'phone':phone}

    rows.append(row)

df = pd.DataFrame(rows)

Output:

print(df.to_string())
                                           name                                               address         phone
0                  Upper Eastside Orthodontists            153 E 87th St Apt 1b, New York, NY, 10128   888-378-2976
1                                          Kara             30 E 60th St Rm 503, New York, NY, 10022   212-355-2195
2                   Central Park West Dentistry                    25 W 68th St, New York, NY, 10023   212-579-8885
3   Majid Rajabi Khamesi Advanced Family Dental             30 E 40th St Rm 705, New York, NY, 10016   212-481-2535
4                      Robert Veligdan, DMD, PC                   343 W 58th St, New York, NY, 10019   212-832-2330
5                          Irina Rossinski, DDS               30 5th Ave Apt 1g, New York, NY, 10011   212-673-3700
6                            Dr. Michael J. Wei      425 Madison Ave.20th Floor, New York, NY, 10017   646-798-6490
7                          Manhattan Dental Spa        200 Madison Ave Ste 2201, New York, NY, 10016   212-683-2530
8                              Expert Dental PC            110 E 40th St Rm 104, New York, NY, 10016   212-682-2965
9              Dr. Jonathan Freed, D.D.S., P.C.          315 Madison Ave Rm 509, New York, NY, 10017   212-682-5644
10                  Clifford S. Melnick, DMD PC             41 W 58th St Apt 2e, New York, NY, 10019   212-355-1266
11                          Ronald Birnbaum Dds                   425 W 59th St, New York, NY, 10019   212-523-8030
12                        Concerned Dental Care             30 E 40th St Rm 207, New York, NY, 10016   212-696-4979
13              DownTown Dental Cosmetic Center                    160 Broadway, New York, NY, 10038   212-964-3337
14                         Beth Caunitz, D.D.S.  30 East 40th Street, Suite 406, New York, NY, 10016   212-206-9002
15                       Alice Urbankova DDS, P            630 5th Ave Ste 1860, New York, NY, 10111   212-765-7340
16                             Wu Darryl DDS PC                 41 Elizabeth St, New York, NY, 10013   212-925-7757
17                             Gerald Rosen DDS                    59 E 54th St, New York, NY, 10022   212-753-9860
18                          Group Health Dental                   230 W 41st St, New York, NY, 10036   212-398-9690
19                       Dr. Shaun Massiah, DMD             50 W 97th St Apt 1c, New York, NY, 10025   212-222-5225
20                               Park 56 Dental            120 E 56th St Rm 610, New York, NY, 10022   347-770-3915
21                               Rubin Esther S                    18 E 48th St, New York, NY, 10017   212-593-7272
22                           David P Pitman DMD            57 W 57th St Ste 707, New York, NY, 10019   212-888-2833
23                   Daniell Jason Mishaan, DMD                   241 W 37th St, New York, NY, 10018   212-730-4440
24                          Dolman Oral Surgery            16 E 52nd St Ste 402, New York, NY, 10022   212-696-0167
25                                Emagen Dental              250 8th Ave Apt 2s, New York, NY, 10011   212-352-9300
26                    The Exchange Dental Group             39 Broadway Rm 2115, New York, NY, 10006   212-422-9229
27            Joshua M. Wilges DDS & Associates   2 West 45th Street Suite 1708, New York, NY, 10036   646-590-2100
28                           Oren Rahmanan, DDS       1 Rockefeller Plz Rm 2223, New York, NY, 10020   212-581-6736
29                       Victoria Veytsman, DDS         509 Madison Ave Rm 1704, New York, NY, 10022   212-759-6700
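One fragile spot in the HTML approach is that `find_next` returns `None` when a listing has no address or phone, which brings back the same `AttributeError` from the question. A minimal sketch of guarding against that, run on made-up inline markup; the `result` wrapper divs and the helper names `text_or_none` and `parse_businesses` are assumptions for illustration, and the phone link is matched on the single class `phones` rather than the full class string:

```python
from bs4 import BeautifulSoup

# Hypothetical markup loosely modelled on the listing structure described above.
SAMPLE_HTML = """
<div class="result">
  <a class="business-name" href="/new-york-ny/bpp/upper-eastside-orthodontists-20151"><span>Upper Eastside Orthodontists</span></a>
  <div class="street-address">153 E 87th St Apt 1b, New York, NY, 10128</div>
  <a class="phones phone primary">Call Now888-378-2976</a>
</div>
<div class="result">
  <a class="business-name" href="/new-york-ny/bpp/kara-5721648"><span>Kara</span></a>
</div>
"""


def text_or_none(tag):
    """Return the stripped text of a tag, or None when the tag was not found."""
    return tag.get_text(strip=True) if tag is not None else None


def parse_businesses(html):
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for link in soup.find_all("a", {"class": "business-name"}):
        address = text_or_none(link.find_next("div", {"class": "street-address"}))
        phone = text_or_none(link.find_next("a", {"class": "phones"}))
        if phone is not None:
            phone = phone.replace("Call Now", "")
        rows.append({"name": link.get_text(strip=True),
                     "address": address,
                     "phone": phone})
    return rows


rows = parse_businesses(SAMPLE_HTML)
for row in rows:
    print(row)
```

Note that `find_next` searches the rest of the document, so a listing that lacks an address can silently pick up the address of a later listing. The guard only covers the case where nothing matches at all.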

Upvotes: 2
