Pujitha Gangarapu

Reputation: 51

How to extract the strong elements inside a div tag

I am new to web scraping and am using Python to scrape the data. Can someone help me extract the data from this markup:

<div class="dept"><strong>LENGTH:</strong> 15 credits</div>

My output should be LENGTH: 15 credits

Here is my code:

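from urllib.request import urlopen
from bs4 import BeautifulSoup

# Fetch and parse the page (URL given at the end of the question).
html = urlopen("http://www.mastersindatascience.org/specialties/business-analytics/")
bsObj = BeautifulSoup(html, "html.parser")

# Print every <strong> label together with the text node that follows it.
length = bsObj.findAll("strong")
for leng in length:
    print(leng.text, leng.next_sibling)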

Output:

DELIVERY:  Campus
LENGTH:  2 years
OFFERED BY:  Olin Business School

But I would like to get only the LENGTH entry.

Website: http://www.mastersindatascience.org/specialties/business-analytics/

Upvotes: 5

Views: 14305

Answers (2)

Aayush Khawaja

Reputation: 33

If anyone is still looking for this, here is an example:

age = soup.find('div', class_='item-birthday').find('strong').get_text()

This gets the text of the strong element that is inside the div with the given class.
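A minimal sketch of the same idea applied to the markup from the question, assuming each entry sits in a div with class "dept" as in the snippet there. The sample HTML string and the variable names are illustrative and are not taken from the live page.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the question's snippet (illustrative, not fetched from the site).
html = """
<div class="dept"><strong>DELIVERY:</strong> Campus</div>
<div class="dept"><strong>LENGTH:</strong> 15 credits</div>
"""

soup = BeautifulSoup(html, "html.parser")

results = []
for div in soup.find_all("div", class_="dept"):
    strong = div.find("strong")
    # Keep only the divs whose <strong> label is exactly "LENGTH:".
    if strong is not None and strong.get_text(strip=True) == "LENGTH:":
        # get_text() on the div joins the label and the text node after it.
        results.append(div.get_text(" ", strip=True))

print(results)
```

Reading the text from the enclosing div rather than from the strong element alone returns the label and its value together, which is the form the question asks for.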

Upvotes: -1

alecxe

Reputation: 473833

You should improve your code a bit to locate the strong element by text:

soup.find("strong", text="LENGTH:").next_sibling

Or, for multiple lengths:

for length in soup.find_all("strong", text="LENGTH:"):
    print(length.next_sibling.strip())

Demo:

>>> import requests
>>> from bs4 import BeautifulSoup
>>>
>>> url = "http://www.mastersindatascience.org/specialties/business-analytics/"
>>> response = requests.get(url)
>>> soup = BeautifulSoup(response.content, "html.parser")
>>> for length in soup.find_all("strong", text="LENGTH:"):
...     print(length.next_sibling.strip())
... 
33 credit hours
15 months
48 Credits
...
12 months
1 year
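For trying this without a network request, here is a small offline sketch of the same lookup. It assumes markup shaped like the snippet in the question; the sample HTML is made up for illustration. It also uses string= rather than text=, since Beautiful Soup 4.4.0 and later document the argument under that name while still accepting text=.

```python
from bs4 import BeautifulSoup

# Illustrative markup shaped like the question's snippet (not the real page).
html = """
<div class="dept"><strong>DELIVERY:</strong> Campus</div>
<div class="dept"><strong>LENGTH:</strong> 15 credits</div>
<div class="dept"><strong>LENGTH:</strong> 2 years</div>
"""

soup = BeautifulSoup(html, "html.parser")

lengths = []
# Match only <strong> tags whose text is exactly "LENGTH:".
for label in soup.find_all("strong", string="LENGTH:"):
    sibling = label.next_sibling
    # next_sibling can be None or another tag, so keep only text nodes.
    if isinstance(sibling, str):
        lengths.append(sibling.strip())

print(lengths)
```

The isinstance check is a defensive choice: if a label is the last node in its parent, next_sibling is None and calling strip() on it would raise an AttributeError.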

Upvotes: 6
