Reputation: 6668
I am trying to learn Python and Portuguese, so I thought I could kill two birds with one stone.
Here is an example of one of the pages. I want to download the data in the blue tables: the first such table is called Presente, the next is called Pretérito Perfeito, and so on.
My code is below, but I'm struggling. My results variable does contain the data I need, but pulling out the exact parts is beyond me because the div tags don't have ids.
Is there a better way to do this?
import requests
from bs4 import BeautifulSoup
URL = 'https://conjugator.reverso.net/conjugation-portuguese-verb-ser.html'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
results = soup.find(id='ch_divSimple')
mychk = results.prettify()
tbl_elems = results.find_all('section', class_='wrap-verbs-listing')
Upvotes: 0
Views: 50
Reputation: 1249
They don't have ids but they have classes. You can do:
results.find_all("div", "blue-box-wrap")
Where blue-box-wrap is a class. It will return a ResultSet object of length 22, as there are 22 blue tables. You can select the one you want with indexing, like this for the first one:
blue_tables = results.find_all("div", "blue-box-wrap")
blue_tables[0]
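Once you have the list, you will probably want the tense name and the conjugation rows out of each element. Here is a rough sketch of that step. It runs on a small hand-written HTML sample instead of the live page, so the inner structure (a p holding the tense name, followed by a ul of li rows) is an assumption you should check against the real markup, and extract_tables is just a hypothetical helper name.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for the page; the real markup may differ.
SAMPLE_HTML = """
<div id="ch_divSimple">
  <div class="blue-box-wrap">
    <p>Presente</p>
    <ul class="wrap-verbs-listing">
      <li><i>eu </i><i>sou</i></li>
      <li><i>tu </i><i>és</i></li>
    </ul>
  </div>
  <div class="blue-box-wrap">
    <p>Pretérito Perfeito</p>
    <ul class="wrap-verbs-listing">
      <li><i>eu </i><i>fui</i></li>
      <li><i>tu </i><i>foste</i></li>
    </ul>
  </div>
</div>
"""


def extract_tables(html):
    """Return {tense name: [conjugation rows]} for every blue box."""
    soup = BeautifulSoup(html, "html.parser")
    tables = {}
    for box in soup.find_all("div", "blue-box-wrap"):
        title = box.find("p")  # assumed: the tense name sits in a <p>
        if title is None:
            continue  # skip boxes that do not match the assumed layout
        # Collapse whitespace so each row reads like "eu sou".
        rows = [" ".join(li.get_text().split()) for li in box.find_all("li")]
        tables[title.get_text(strip=True)] = rows
    return tables


tables = extract_tables(SAMPLE_HTML)
for tense, rows in tables.items():
    print(tense, rows)
```

To try it on the real page, pass page.content from your requests call to extract_tables instead of SAMPLE_HTML and adjust the tag names if the output comes back empty.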
Upvotes: 1
Reputation: 71689
Replace:
results = soup.find(id='ch_divSimple')
mychk = results.prettify()
tbl_elems = results.find_all('section', class_='wrap-verbs-listing')
With:
results = soup.find("div", attrs={"class": 'blue-box-wrap'})
tbl_elems = results.find_all('ul', class_='wrap-verbs-listing')
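One thing to keep in mind: find returns only the first matching element (or None if nothing matches), so this gives you the lists inside the first blue box only. If you want the lists from every box in one go, a CSS selector via select is one option. A minimal sketch on a made-up HTML fragment, assuming each ul with class wrap-verbs-listing is nested inside a div with class blue-box-wrap:

```python
from bs4 import BeautifulSoup

# Made-up fragment with two blue boxes; the real page may be laid out differently.
html = """
<div class="blue-box-wrap">
  <ul class="wrap-verbs-listing"><li>eu sou</li></ul>
</div>
<div class="blue-box-wrap">
  <ul class="wrap-verbs-listing"><li>eu fui</li></ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() stops at the first blue box, and may return None.
first_box = soup.find("div", attrs={"class": "blue-box-wrap"})
first_only = first_box.find_all("ul", class_="wrap-verbs-listing") if first_box else []

# select() takes a CSS selector and returns matches from every blue box.
all_lists = soup.select("div.blue-box-wrap ul.wrap-verbs-listing")

print(len(first_only), len(all_lists))
```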
Upvotes: 1