semiflex

Reputation: 1246

Scraping HTML data from website in Python

I'm trying to scrape specific pieces of data from web pages, but I can't extract the parts I want. For instance, I set myself the challenge of scraping the number of followers from this blog, and I can't get it to work.

I've tried urllib, requests, and BeautifulSoup, as well as the Jam API.

Here's what my code looks like at the moment:

from bs4 import BeautifulSoup
from urllib import urlopen
import json
import urllib2

html = urlopen('http://freelegalconsultancy.blogspot.co.uk/')
soup = BeautifulSoup(html, "lxml")
print soup

How would I go about pulling the number of followers in this instance?

Upvotes: 0

Views: 84

Answers (1)

jmoz

Reputation: 8006

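You can't grab the follower count this way because it's a widget loaded by JavaScript, so the number is not in the HTML that urlopen downloads. What you can grab are the parts that are in the HTML, selected by CSS class, by id, or by element name.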

For example:

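from bs4 import BeautifulSoup
from urllib import urlopen  # Python 2; in Python 3 use urllib.request.urlopen

html = urlopen('http://freelegalconsultancy.blogspot.co.uk/')
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(html, "lxml")

# The <h1> is in the static HTML, so it can be read directly
assert soup.h1.string == '\nLAW FOR ALL-M.MURALI MOHAN\n'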
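
To show the three kinds of lookup (by element, by id, by class) without depending on the live page, here is a minimal sketch that parses an inline fragment instead. The markup and the names SAMPLE_HTML, followers-widget and links are invented for illustration and are not the blog's real structure. It assumes Python 3 with bs4 installed and uses the built-in html.parser.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment, invented for illustration; not the real blog markup.
SAMPLE_HTML = """
<html><body>
  <h1 class="title">Example Blog</h1>
  <div id="sidebar">
    <ul class="links">
      <li><a href="/about">About</a></li>
      <li><a href="/contact">Contact</a></li>
    </ul>
    <div id="followers-widget"></div>
    <script src="widget.js"></script>
  </div>
</body></html>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# By element name: the first <h1> in the document.
heading = soup.h1.get_text(strip=True)

# By id: find() accepts the id as a keyword argument.
widget = soup.find(id="followers-widget")

# By CSS class: class_ (with the underscore) filters on the class attribute.
link_list = soup.find("ul", class_="links")
link_texts = [a.get_text() for a in link_list.find_all("a")]

# The widget container exists but has no text, because the script that
# would fill it never runs when the HTML is only downloaded and parsed.
widget_is_empty = widget.get_text(strip=True) == ""

print(heading)
print(link_texts)
print(widget_is_empty)
```

The empty container is the same situation as the followers widget: the element may be in the source, but its content only appears after a browser runs the script.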

Upvotes: 1
