Reputation: 5228
I'm trying to use Beautiful Soup to get the number of datasets listed on this website.
However, I'm not sure what is wrong with my code.
I can't seem to extract just the number of datasets (the expected value is 3908).
base_url = www.quandl.com/data/TSE
web_content = BeautifulSoup(requests.get(base_url).text, "html.parser")
for stats in web_content.findAll('table', attrs={'class'}):
    print(stats)
How should I structure my code?
Upvotes: 0
Views: 65
Reputation: 5972
Try:
attrs={'class' : ''}
So you have:
from bs4 import BeautifulSoup
import requests
base_url = 'http://www.quandl.com/data/TSE'
web_content = BeautifulSoup(requests.get(base_url).text, "html.parser")
for stats in web_content.findAll('table', attrs={'class' : ''}):
    print(stats)
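Note: If the target page builds its content with JavaScript, requests alone
is not a good fit, because it only downloads the raw HTML and does not run scripts. You can try a headless browser such as PhantomJS instead.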
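One quick way to tell whether JavaScript rendering is the problem is to check if the expected value appears at all in the HTML as served. Here is a minimal sketch; the helper name `present_in_static_html` and both sample pages are hypothetical and only stand in for a real response body.

```python
def present_in_static_html(html_text, expected):
    """Return True if `expected` occurs in the HTML as served, before any JavaScript runs."""
    return expected in html_text


# Hypothetical response bodies, for illustration only.
static_page = "<table><tr><td>Datasets</td><td>3908</td></tr></table>"
scripted_page = "<div id='app'></div><script src='app.js'></script>"

print(present_in_static_html(static_page, "3908"))
print(present_in_static_html(scripted_page, "3908"))
```

With a real page you would pass requests.get(base_url).text as the first argument. If the value is missing from that text, no parser will find it and a browser-based tool is probably needed.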
Edit:
from lxml import html
import requests
base_url = 'http://www.quandl.com/data/TSE'
web_content = requests.get(base_url).text
tree = html.fromstring(web_content)
print(tree.xpath('//tr/td/text()')[3])
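The [3] index above depends on the exact order of the cells on the page, so it may break if the layout changes. Here is a sketch of a label-based lookup with Beautiful Soup, assuming the count sits in the cell right after a cell whose text is "Datasets". The sample markup and the helper name `cell_after_label` are hypothetical, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; the actual structure may differ.
SAMPLE_HTML = """
<table>
  <tr><td>Datasets</td><td>3908</td></tr>
  <tr><td>Source</td><td>example</td></tr>
</table>
"""


def cell_after_label(html_text, label):
    """Return the text of the cell that follows the cell equal to `label`, or None."""
    soup = BeautifulSoup(html_text, "html.parser")
    for row in soup.find_all("tr"):
        # Collect the stripped text of every cell in this row.
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if label in cells:
            idx = cells.index(label)
            if idx + 1 < len(cells):
                return cells[idx + 1]
    return None


dataset_count = cell_after_label(SAMPLE_HTML, "Datasets")
print(dataset_count)
```

The function returns a string (or None when the label is not found), so convert it with int() if you need a number.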
Upvotes: 1