Reputation: 1178
I am trying to scrape this site: http://stcw.marina.gov.ph/find/?c_n=14-111112&opt=stcw and get the table at the bottom. When I scrape it, I get some elements of the first row but nothing from the rest of the table. Here is my code:
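import bs4 as bs
from urllib.request import urlopen  # on Python 2: from urllib2 import urlopen

urlText = "http://stcw.marina.gov.ph/find/?c_n=14-111112&opt=stcw"
url = urlopen(urlText)
soup = bs.BeautifulSoup(url, "html.parser")
certificates = soup.find('table', class_='table table-bordered')
# Print the text of every cell in every row of the table
for row in certificates.find_all('tr'):
    for td in row.find_all('td'):
        print(td.text)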
What I get as an output is:
22-20353
SHIP SECURITY OFFICER
I expected the whole table instead. What am I missing?
Upvotes: 1
Views: 80
Reputation: 474221
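This is yet another case where the underlying parser makes a difference: each parser recovers from malformed HTML differently, so html.parser can end up with a truncated table. Switch to lxml or html5lib to see the complete table parsed: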
soup = bs.BeautifulSoup(url, "lxml")      # requires the lxml package
# or, alternatively:
soup = bs.BeautifulSoup(url, "html5lib")  # requires the html5lib package
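To check which parser gives the fullest result, one option is to parse the same markup with each parser and compare how many cells come back. The snippet below is only a rough sketch: SAMPLE_HTML is made-up markup with a stray closing tag standing in for the real page (its second row is invented), and count_cells_per_parser is a hypothetical helper name, not part of BeautifulSoup. Parsers that are not installed are reported as None.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical sample markup (not the real page): a table with a stray </div>.
SAMPLE_HTML = """
<table class="table table-bordered">
  <tr><td>22-20353</td><td>SHIP SECURITY OFFICER</td></tr>
  </div>
  <tr><td>00-00000</td><td>EXAMPLE CERTIFICATE</td></tr>
</table>
"""


def count_cells_per_parser(html, parsers=("html.parser", "lxml", "html5lib")):
    """Map each parser name to the number of <td> cells it finds in the table.

    A parser that is not installed maps to None; a missing table maps to 0.
    """
    results = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(html, name)
        except FeatureNotFound:
            # bs4 raises this when the requested parser library is unavailable
            results[name] = None
            continue
        table = soup.find("table", class_="table table-bordered")
        results[name] = len(table.find_all("td")) if table else 0
    return results


if __name__ == "__main__":
    for name, cells in count_cells_per_parser(SAMPLE_HTML).items():
        print(name, cells)
```

For the real page, pass the downloaded HTML (for example urlopen(urlText).read()) instead of SAMPLE_HTML; the parser that reports the most cells is probably the one to use.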
Upvotes: 1