Deepak Nath

Reputation: 101

Extracting a section from a web page using Python

I want to extract the text of the Symptoms section from the web page below using Python and lxml. Can anyone please help?

http://www.ncbi.nlm.nih.gov/pubmedhealth/PMH0001851/

Thank you.

Upvotes: 0

Views: 213

Answers (1)

JKirchartz

Reputation: 18042

You want to scrape a web page with lxml? Try this:

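 from lxml.html import parse

 # Fetch and parse the page, then get the root <html> element
 doc = parse("http://www.ncbi.nlm.nih.gov/pubmedhealth/PMH0001851/").getroot()
 # Print the text of every <h2> heading (cssselect() needs the cssselect package)
 for h2 in doc.cssselect('h2'):
     print(h2.text_content())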

This will open the page and print the text of every h2 heading on it.
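To get from the headings to the section text itself, one option is to find the h2 whose text is "Symptoms" and collect the elements that follow it until the next h2. The sketch below is only an illustration: it runs on a small made-up HTML string (`SAMPLE`) rather than the real page, it assumes the section body sits in sibling elements directly after the heading, and `extract_section` is a hypothetical helper name, not part of lxml. The real page may be laid out differently, so inspect its markup first.

```python
from lxml import html

# Hypothetical sample markup; the real page's structure is an assumption.
SAMPLE = """
<html><body>
<h2>Causes</h2>
<p>Cause text.</p>
<h2>Symptoms</h2>
<p>Fever and chills.</p>
<p>Persistent cough.</p>
<h2>Treatment</h2>
<p>Treatment text.</p>
</body></html>
"""


def extract_section(doc, title):
    """Return the text of each element between the <h2> named `title` and the next <h2>."""
    for h2 in doc.iter('h2'):
        if h2.text_content().strip().lower() == title.lower():
            parts = []
            # Walk the siblings that follow the heading, stopping at the next heading.
            for sib in h2.itersiblings():
                if sib.tag == 'h2':
                    break
                text = sib.text_content().strip()
                if text:
                    parts.append(text)
            return parts
    return []  # no heading with that title was found


doc = html.fromstring(SAMPLE)
symptoms = extract_section(doc, 'Symptoms')
print(symptoms)
```

To try it on the live page, replace `html.fromstring(SAMPLE)` with the `parse(...).getroot()` call from the snippet above.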

Upvotes: 1
