Reputation: 95
I'm able to scrape the company's name and location using the code below, but I'm having trouble scraping the number of followers.
Here is the HTML for reference.
<div class="block mt2">
<div>
<h1 class="ember-view t-24 t-black t-bold full-width" id="ember28" title="Pacific Retail Capital Partners">
<span dir="ltr">Pacific Retail Capital Partners</span>
</h1>
<p class="org-top-card-summary__tagline t-16 t-black">
Our decades of experience and innovative strategies are transforming retail-led centers into high-performing properties.
</p>
<!-- -->
<div class="org-top-card-summary-info-list t-14 t-black--light">
<div class="org-top-card-summary-info-list__info-item">
Leasing Non-residential Real Estate
</div>
<!-- -->
<div class="inline-block">
<div class="org-top-card-summary-info-list__info-item">
El Segundo, CA
</div>
<!-- -->
<div class="org-top-card-summary-info-list__info-item">
4,047 followers
</div>
</div>
</div>
</div>
</div>
Scraping the company's name was straightforward:
info_div = soup.find('div', {'class' : 'block mt2'})
#print(info_div)
info_name = info_div.find_all('h1')
company_name = info_name[0].get_text().strip()
print(company_name, type(company_name),len(company_name))
The company's location was accessed like this:
info_block = info_div.find_all('div', {'class' : 'inline-block'})
info_loc = info_block[0].find('div', {'class' : 'org-top-card-summary-info-list__info-item'}).get_text().strip()
print(info_loc)
How can I scrape the second element inside the inline-block div, i.e. "4,047 followers"?
Upvotes: 0
Views: 628
Reputation: 533
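You can use the :contains pseudo-class in the CSS selector. Here we look for a div with the specified class name whose text contains "followers". (In recent versions of Soup Sieve, the selector library behind BeautifulSoup's select methods, this pseudo-class is spelled :-soup-contains and the plain :contains form is deprecated.)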
followers_div = soup.select_one('.org-top-card-summary-info-list__info-item:contains(followers)')
This returns:
<div class="org-top-card-summary-info-list__info-item">
4,047 followers
</div>
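To go from that element to an actual number, something along these lines may work. This is only a sketch: the HTML is a trimmed copy of the snippet from the question, `parse_followers` is a hypothetical helper name, and the `:-soup-contains` spelling assumes a reasonably recent Soup Sieve (2.1 or later). The second option relies on the followers item always being the second one inside the inline-block div, which may not hold for every page.

```python
import re
from bs4 import BeautifulSoup

# Trimmed copy of the HTML from the question.
html = """
<div class="block mt2">
  <div class="org-top-card-summary-info-list t-14 t-black--light">
    <div class="org-top-card-summary-info-list__info-item">
      Leasing Non-residential Real Estate
    </div>
    <div class="inline-block">
      <div class="org-top-card-summary-info-list__info-item">
        El Segundo, CA
      </div>
      <div class="org-top-card-summary-info-list__info-item">
        4,047 followers
      </div>
    </div>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def parse_followers(text):
    """Hypothetical helper: extract the first number from e.g. '4,047 followers'."""
    match = re.search(r"\d[\d,]*", text)
    return int(match.group().replace(",", "")) if match else None


# Option 1: select the item whose text contains "followers".
followers_div = soup.select_one(
    '.org-top-card-summary-info-list__info-item:-soup-contains("followers")'
)
followers_text = followers_div.get_text(strip=True)
followers_count = parse_followers(followers_text)

# Option 2: rely on position, assuming followers is the second item
# inside the inline-block div (location is the first).
items = soup.select(".inline-block .org-top-card-summary-info-list__info-item")
followers_by_position = parse_followers(items[1].get_text(strip=True))

print(followers_text, followers_count, followers_by_position)
```

Matching on the text is probably the more robust of the two, since it does not break if the page adds or drops an item before the followers count.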
Upvotes: 1