Reputation: 33
I am scraping information from this page:
https://lawyers.justia.com/lawyer/michael-paul-ehline-85006
I am trying to scrape everything under the Fees section.
I want the following information:
Free Consultation
Yes
Credit Cards Accepted
Visa, Mastercard, American Express
Contingent Fees
In personal injury cases only.
Rates, Retainers and Additional Information
Rates vary on a case by case basis.
This is what I have tried:
for thing in soup.findAll('ul', attrs={"class": "has-no-list-styles"}):
    ul=thing.find('<li>')
    print(ul)
but the output is:
<li>Intellectual Property</li>
<li>Copyright Law</li>
<li><strong>English</strong></li>
Thank you in advance.
UPDATE: I found a solution, but it gives me an infinite loop. Any suggestions?
for o in soup.findAll('div', attrs={"class": "block-wrapper"}):
    for tag in soup.findAll('div', attrs={"class": "block-wrapper"}):
        if tag.string:
            tag.string.replace_with("")
    for de in o.findAll("li"):
        if de != []:
            de=remove_tags(str(de))
            print (de)
Upvotes: 1
Views: 61
Reputation: 441
Try this. It was inspired by dabinsou's answer. It looks for the icon he described, moves to the icon's parent, takes that parent's next sibling, and prints the sibling's text.
import requests
from bs4 import BeautifulSoup

URL = "https://lawyers.justia.com/lawyer/michael-paul-ehline-85006"
r = requests.get(URL)
soup = BeautifulSoup(r.content, 'html.parser')
# Locate the fee icon that marks the Fees section
uls = soup.find('span', attrs={"class": "jicon -large jicon-fee"})
# Step from the icon's parent to its next sibling, which holds the fee text
print(uls.parent.next_sibling.text)
Adjust your scraping to meet that, and see if that helps!
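If the sibling lookup turns out to be fragile (for example, when a whitespace text node sits between the two elements), a variation is to search forward from the icon with find_next instead. The sketch below is only an illustration: SAMPLE_HTML is made-up markup standing in for the live page, whose real structure may differ, and fee_lines is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the live page; the real structure may differ.
SAMPLE_HTML = """
<div class="block-wrapper">
  <ul class="has-no-list-styles">
    <li>Intellectual Property</li>
    <li>Copyright Law</li>
  </ul>
</div>
<div class="block-wrapper">
  <div><span class="jicon -large jicon-fee"></span><strong>Fees</strong></div>
  <ul class="has-no-list-styles">
    <li><strong>Free Consultation</strong> Yes</li>
    <li><strong>Credit Cards Accepted</strong> Visa, Mastercard, American Express</li>
  </ul>
</div>
"""


def fee_lines(html):
    """Return the text fragments of the first matching list after the fee icon."""
    soup = BeautifulSoup(html, "html.parser")
    # Match on the single class that identifies the fee icon.
    icon = soup.find("span", class_="jicon-fee")
    if icon is None:
        return []
    # find_next walks forward in document order, so lists before the icon are skipped.
    ul = icon.find_next("ul", class_="has-no-list-styles")
    if ul is None:
        return []
    lines = []
    for li in ul.find_all("li"):
        # stripped_strings yields each text fragment with surrounding whitespace removed.
        lines.extend(li.stripped_strings)
    return lines


for line in fee_lines(SAMPLE_HTML):
    print(line)
```

On the live page you would pass r.content instead of SAMPLE_HTML. Anchoring on the icon rather than on the list class is what keeps the practice-area and language lists out of the result.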
Upvotes: 0
Reputation: 2469
Try this.
from simplified_scrapy import SimplifiedDoc,req
html = req.get('https://lawyers.justia.com/lawyer/michael-paul-ehline-85006')
doc = SimplifiedDoc(html)
ul = doc.getElement('ul',attr='class',value='has-no-list-styles',start='class="jicon -large jicon-fee"') # Use class="jicon -large jicon-fee" to locate
print (ul.text)
Result:
Free ConsultationYesCredit Cards AcceptedVisa, Mastercard, American ExpressContingent FeesIn personal injury cases only.Rates, Retainers and Additional InformationRates vary on a case by case basis.
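The result above runs the labels and values together. If you want one item per line, as in the question, one option is BeautifulSoup's get_text with a separator. This swaps in bs4 rather than simplified_scrapy, and SAMPLE_UL is assumed markup, not the page's actual HTML; ul_text_by_line is a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hypothetical list markup; the live page's tags may differ.
SAMPLE_UL = """
<ul class="has-no-list-styles">
  <li><strong>Free Consultation</strong> Yes</li>
  <li><strong>Contingent Fees</strong> In personal injury cases only.</li>
</ul>
"""


def ul_text_by_line(html):
    """Join every non-empty text fragment in the list with newlines."""
    ul = BeautifulSoup(html, "html.parser").find("ul", class_="has-no-list-styles")
    # strip=True trims each fragment and drops whitespace-only ones.
    return ul.get_text(separator="\n", strip=True)


print(ul_text_by_line(SAMPLE_UL))
```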
Upvotes: 1