Reputation: 3661
I'm new to Python. Here is my code, which works on Python 2.7.5:
import urllib2
import sys
url ="mydomain.com"
usock = urllib2.urlopen(url)
data = usock.read()
usock.close()
print data
This fetches the HTML markup, and it works.
What I want to do is get the value inside a <font class="big"></font>
tag. For example, I need the value Data from this markup:
<font class="big">Data</font>
How can I do this?
Upvotes: 5
Views: 20332
Reputation: 369064
Using lxml:
import urllib2
import lxml.html
url ="mydomain.com"
usock = urllib2.urlopen(url)
data = usock.read()
usock.close()
for font in lxml.html.fromstring(data).cssselect('font.big'):
    print font.text
A quick check in the interactive interpreter:
>>> import lxml.html
>>> root = lxml.html.fromstring('<font class="big">Data</font>')
>>> [font.text for font in root.cssselect('font.big')]
['Data']
Upvotes: 1
Reputation: 59974
You can use an HTML parser module such as BeautifulSoup:
import urllib2
from bs4 import BeautifulSoup as BS
url ="mydomain.com"
usock = urllib2.urlopen(url)
data = usock.read()
usock.close()
soup = BS(data, 'html.parser')  # name the parser explicitly
print soup.find('font', {'class':'big'}).text
This finds the first <font> tag with class="big" and prints its text content. If no such tag exists, find returns None, so the .text access raises an AttributeError.
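If installing a third-party parser is not an option, the same lookup can be sketched with the standard library's HTMLParser. This is a swapped-in technique, not what the answer above uses. The names BigFontCollector and big_font_texts are hypothetical helpers made up for this illustration, and the sketch assumes reasonably well-formed markup.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2


class BigFontCollector(HTMLParser):
    """Hypothetical helper: collects the text of every <font class="big">."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.depth = 0     # > 0 while inside a matching <font> element
        self.results = []  # one string per matching element

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Already inside a match: track nested <font> tags so the
            # matching end tag is recognized correctly.
            if tag == 'font':
                self.depth += 1
        elif tag == 'font':
            # The class attribute may hold several space-separated names.
            classes = (dict(attrs).get('class') or '').split()
            if 'big' in classes:
                self.depth = 1
                self.results.append('')

    def handle_endtag(self, tag):
        if tag == 'font' and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data


def big_font_texts(html):
    """Return the text of each <font class="big"> element, in document order."""
    parser = BigFontCollector()
    parser.feed(html)
    parser.close()
    return parser.results
```

With the question's code, this would be called as big_font_texts(data). On Python 3 the downloaded bytes would need to be decoded to text first, and the page's encoding is something you would have to determine yourself.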
Upvotes: 9