pKa

Reputation: 11

Web crawling using Python and BeautifulSoup

How do I extract the text inside <p> and <li> tags that are nested under a <div> with a specific class name?

Upvotes: 0

Views: 1723

Answers (1)

pp_

Reputation: 3489

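Use the methods find() and find_all(). find() returns the first matching tag (or None if nothing matches), and find_all() returns a list of every match: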

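import requests
from bs4 import BeautifulSoup

url = '...'

# download the page and fail early on an HTTP error
r = requests.get(url, timeout=10)
r.raise_for_status()
soup = BeautifulSoup(r.text, 'html.parser')

# find() returns the first matching <div>, or None if there is none
div = soup.find('div', {'class': 'class-name'})
if div is None:
    raise SystemExit('no <div class="class-name"> found')

# find_all() returns every matching descendant of that <div>
ps = div.find_all('p')
lis = div.find_all('li')

# print the content of all <p> tags
for p in ps:
    print(p.text)

# print the content of all <li> tags
for li in lis:
    print(li.text)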
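
If you want to try this without hitting a real site, here is a minimal offline sketch of the same find() / find_all() approach. The HTML string and the helper name extract_text are made up for illustration (they are not from the question), and the class name 'class-name' is the same placeholder as above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded page (assumption, not the asker's page).
HTML = """
<div class="other"><p>ignored paragraph</p></div>
<div class="class-name">
  <p>First paragraph</p>
  <ul>
    <li>Item one</li>
    <li>Item two</li>
  </ul>
  <p>Second paragraph</p>
</div>
"""


def extract_text(html, class_name):
    """Return (paragraph_texts, list_item_texts) from the first <div> with class_name."""
    soup = BeautifulSoup(html, "html.parser")
    # class_ is Beautiful Soup's keyword for filtering on the CSS class attribute
    div = soup.find("div", class_=class_name)
    if div is None:
        # no such <div>: return empty results instead of raising AttributeError
        return [], []
    paragraphs = [p.get_text(strip=True) for p in div.find_all("p")]
    items = [li.get_text(strip=True) for li in div.find_all("li")]
    return paragraphs, items


paragraphs, items = extract_text(HTML, "class-name")
print(paragraphs)  # ['First paragraph', 'Second paragraph']
print(items)       # ['Item one', 'Item two']
```

The <p> inside the other <div> is skipped because the search starts from the matched <div>, not from the whole document. Returning empty lists when the <div> is missing is just one possible choice; raising an error may suit your crawler better.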

Upvotes: 3
