Reputation: 51
I am new to web scraping and am using Python to scrape the data. Can someone help me extract the data from:
<div class="dept"><strong>LENGTH:</strong> 15 credits</div>
My output should be LENGTH: 15 credits
Here is my code:
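from urllib.request import urlopen
from bs4 import BeautifulSoup

# Fetch and parse the page (this step was missing from the snippet; the URL is the one given below).
html = urlopen("http://www.mastersindatascience.org/specialties/business-analytics/")
bsObj = BeautifulSoup(html, "html.parser")
length = bsObj.findAll("strong")
for leng in length:
    print(leng.text, leng.next_sibling)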
Output:
DELIVERY: Campus
LENGTH: 2 years
OFFERED BY: Olin Business School
But I would like to get only the LENGTH line.
Website: http://www.mastersindatascience.org/specialties/business-analytics/
Upvotes: 5
Views: 14305
Reputation: 33
In case anyone is still looking for this, here is an example:
age = soup.find('div', class_='item-birthday').find('strong').get_text()
This finds the div with the given class and returns the text of the strong element inside it.
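Applied to the markup in the question, the same "div first, then strong" idea might look like the sketch below. It is only an illustration: SAMPLE_HTML and find_length_line are hypothetical names, and the sample markup is modelled on the snippet in the question rather than fetched from the live page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, modelled on the snippet in the question.
SAMPLE_HTML = """
<div class="dept"><strong>DELIVERY:</strong> Campus</div>
<div class="dept"><strong>LENGTH:</strong> 15 credits</div>
"""

def find_length_line(html):
    """Return the full text of the first 'dept' div whose label is LENGTH:."""
    soup = BeautifulSoup(html, "html.parser")
    for div in soup.find_all("div", class_="dept"):
        label = div.find("strong")
        # Compare the label text so other fields (DELIVERY:, etc.) are skipped.
        if label is not None and label.get_text(strip=True) == "LENGTH:":
            # Join the label and the value with a single space.
            return div.get_text(" ", strip=True)
    return None

print(find_length_line(SAMPLE_HTML))
```

Checking the label text is what restricts the result to the LENGTH row; without it, the first strong element in any matching div would be returned.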
Upvotes: -1
Reputation: 473833
You can improve your code by locating the strong element by its text:
soup.find("strong", text="LENGTH:").next_sibling
Or, for multiple lengths:
for length in soup.find_all("strong", text="LENGTH:"):
    print(length.next_sibling.strip())
Demo:
>>> import requests
>>> from bs4 import BeautifulSoup
>>>
>>> url = "http://www.mastersindatascience.org/specialties/business-analytics/"
>>> response = requests.get(url)
>>> soup = BeautifulSoup(response.content, "html.parser")
>>> for length in soup.find_all("strong", text="LENGTH:"):
...     print(length.next_sibling.strip())
...
33 credit hours
15 months
48 Credits
...
12 months
1 year
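For trying this without a network request, here is a small offline sketch of the same lookup. It assumes a recent Beautiful Soup release, where the documentation lists string as the newer name for the text argument (since 4.4.0). SAMPLE_HTML and extract_lengths are hypothetical names, and the markup is made up to resemble the page. The guard on next_sibling is an addition for labels that have no value after them.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the live page.
SAMPLE_HTML = """
<div class="dept"><strong>DELIVERY:</strong> Campus</div>
<div class="dept"><strong>LENGTH:</strong> 15 credits</div>
<div class="dept"><strong>OFFERED BY:</strong> Olin Business School</div>
<div class="dept"><strong>LENGTH:</strong> 2 years</div>
<div class="dept"><strong>LENGTH:</strong></div>
"""

def extract_lengths(html):
    """Collect the text that follows every <strong>LENGTH:</strong> label."""
    soup = BeautifulSoup(html, "html.parser")
    lengths = []
    # string= is the newer spelling of the text= argument used above.
    for label in soup.find_all("strong", string="LENGTH:"):
        value = label.next_sibling
        # next_sibling is None when nothing follows the label; skip those.
        if isinstance(value, str) and value.strip():
            lengths.append(value.strip())
    return lengths

for length in extract_lengths(SAMPLE_HTML):
    print("LENGTH:", length)
```

Returning a list rather than printing inside the loop makes it easy to prepend the label, as the question asks, or to reuse the values elsewhere.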
Upvotes: 6