imns

Reputation: 5082

Python BeautifulSoup adding extra end tags

I'm using BeautifulSoup to parse a website:

  import urllib2
  import BeautifulSoup

  request = urllib2.Request(url)
  response = urllib2.urlopen(request)
  soup = BeautifulSoup.BeautifulSoup(response)

I'm using it to traverse a table. The problem is that BeautifulSoup adds a closing table tag that doesn't exist in the original HTML, which I verified with print soup.prettify(). As a result, one of the td tags ends up outside the table and I can't select it.

Upvotes: 1

Views: 904

Answers (1)

ebt

Reputation: 1358

How about searching directly for each tag instead of trying to traverse into the table?

   for td in soup.findAll("td"):
        ...

It's not unusual to find a tbody tag nested inside a table automatically even when it's not in the source HTML. You can either account for it in your code or jump straight to the tr or td tags.
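To illustrate the difference, here is a minimal sketch. It assumes the newer bs4 package (Beautiful Soup 4) with the built-in "html.parser" backend rather than the BeautifulSoup 3 module from the question, and the sample markup and function names are made up for the example. It contrasts searching the whole document for td tags with searching only inside the table element:

```python
# Assumption: bs4 (Beautiful Soup 4) is installed; the question itself
# uses the older BeautifulSoup 3 module, where find_all is spelled findAll.
from bs4 import BeautifulSoup

# Hypothetical sample markup: the table is closed early, so the last <td>
# sits outside the <table> element in the parsed tree.
SAMPLE_HTML = """
<table>
  <tr><td>a</td><td>b</td></tr>
</table>
<td>c</td>
"""


def all_cell_texts(html):
    """Return the text of every <td> in the document, wherever it sits."""
    soup = BeautifulSoup(html, "html.parser")
    return [td.get_text(strip=True) for td in soup.find_all("td")]


def table_cell_texts(html):
    """Return the text of only those <td> tags nested inside the first <table>."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    return [td.get_text(strip=True) for td in table.find_all("td")]


if __name__ == "__main__":
    print(all_cell_texts(SAMPLE_HTML))    # searches the whole document
    print(table_cell_texts(SAMPLE_HTML))  # misses cells outside the table
```

Searching from the soup root does not depend on where the parser decided the table ends, so a stray cell is still found. The trade-off is that it also picks up td tags from any other table on the page, so you may need to filter by attributes or position.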

Upvotes: 1
