AndrewF

Reputation: 59

BeautifulSoup not finding tr id

I'm working through a web scraping exercise using the requests and BeautifulSoup modules in Python 2.7.12. My problem is that the soup object won't return a specific tr by its id. The same happens with a few other HTML elements with ids that I picked at random, including the ones in the print statements below. Any idea why this isn't working? Any help would be greatly appreciated.

import requests
from bs4 import BeautifulSoup as bs

head= {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36',
'Content-Type': 'text/html',}

r = requests.get('http://www.iii.co.uk/investment/detail?code=cotn:LSE:SEE&display=discussion', headers=head)

r_text = r.text
soup = bs(r_text, "html.parser")

print soup.find("tr",id="disc1-12056888")
print soup.find('table', id='discussion-list')

Upvotes: 0

Views: 1154

Answers (2)

宏杰李

Reputation: 12178

I believe html.parser is unreliable in Python 2; use lxml or html5lib instead:

soup = bs(r_text, "lxml")

This is quoted from the Beautiful Soup documentation:

If you can, I recommend you install and use lxml for speed. If you’re using a version of Python 2 earlier than 2.7.3, or a version of Python 3 earlier than 3.2.2, it’s essential that you install lxml or html5lib–Python’s built-in HTML parser is just not very good in older versions.
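To check whether the parser really is the culprit, one option is to parse the same markup with every parser that happens to be installed and compare what each one finds. The sketch below is only an illustration: SAMPLE_HTML is made-up markup that borrows the ids from the question, and count_rows_by_parser is a hypothetical helper, not part of Beautiful Soup. In practice you would pass r.text in place of the sample.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Made-up markup for illustration; the ids are borrowed from the question.
SAMPLE_HTML = """
<table id="discussion-list">
  <tr id="disc1-12056888"><td>First post</td></tr>
  <tr id="disc1-12056889"><td>Second post</td></tr>
</table>
"""


def count_rows_by_parser(html, parsers=("html.parser", "lxml", "html5lib")):
    """Hypothetical helper: count <tr> tags that carry an id, per parser.

    A parser that is not installed is reported as None instead of raising.
    """
    counts = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(html, name)
        except FeatureNotFound:
            counts[name] = None  # parser library missing, skip it
            continue
        # id=True matches any tag that has an id attribute at all
        counts[name] = len(soup.find_all("tr", id=True))
    return counts


counts = count_rows_by_parser(SAMPLE_HTML)
for name, count in counts.items():
    print(name, count)
```

If the counts differ between parsers on the real page, the parser is the likely cause. If every parser reports zero, the rows are probably not in the HTML that requests received in the first place.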

Upvotes: 2

Rafael Aguilar

Reputation: 3279

@AndrewF:

I'd suggest using PyQuery for simpler tasks such as extracting comments. Here is a snippet to show how simple it is:

import requests
import pyquery

head= {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36',
'Content-Type': 'text/html',}

r = requests.get('http://www.iii.co.uk/investment/detail?code=cotn:LSE:SEE&display=discussion', headers=head)

r_text = r.text
pq = pyquery.PyQuery(r_text)

for a in pq('tr.comment div'):
    # a.text is None when the div has no leading text, so guard before strip()
    text = (a.text or '').strip()
    if text:
        print(text)
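For comparison, the same CSS selector can also be used from BeautifulSoup itself through select(), in case you would rather not add another dependency. This is a sketch against made-up markup: SAMPLE_HTML and the comment_texts helper are assumptions for illustration, and only the tr.comment div selector comes from the snippet above.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration; only the "comment" class mirrors the selector.
SAMPLE_HTML = """
<table>
  <tr class="comment"><td><div>  Nice results today.  </div></td></tr>
  <tr class="comment"><td><div>   </div></td></tr>
  <tr class="other"><td><div>Not a comment</div></td></tr>
</table>
"""


def comment_texts(html):
    """Hypothetical helper: return the non-empty text of each comment div."""
    soup = BeautifulSoup(html, "html.parser")
    texts = []
    for div in soup.select("tr.comment div"):
        # get_text(strip=True) trims whitespace; empty divs yield ""
        text = div.get_text(strip=True)
        if text:
            texts.append(text)
    return texts


comments = comment_texts(SAMPLE_HTML)
print(comments)
```

The whitespace-only div and the row without the comment class are both skipped, which matches what the PyQuery loop does.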

Upvotes: 1
