Reputation: 137
I have an HTML document with the following structure:
<!DOCTYPE html>
<html>
<body>
<p>One</p>
<p>Two</p>
<p>Three</p>
</body>
</html>
Can you recommend a Python module that would let me write something like this:
var = ModuleName.html.body.p2
print(var)
Two
Upvotes: 0
Views: 186
Reputation: 816
I would recommend using BeautifulSoup to parse your HTML and extract the content you want with CSS selectors.
You can find an example very similar to what you want to do in the documentation: http://www.crummy.com/software/BeautifulSoup/bs4/doc/#css-selectors
Edit: Here is a code snippet, since the documentation has a typo: it omits the ":" in the selector string.
from bs4 import BeautifulSoup
data = "<!DOCTYPE html> <html> <body><p>One</p><p>Two</p><p>Three</p></body></html>"
soup = BeautifulSoup(data, 'html.parser')
# select() returns a list of matching tags, so this prints [<p>Two</p>]
print(soup.body.select("p:nth-of-type(2)"))
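Note that select() gives back a list of tags rather than the text itself. If you only want the string "Two", something along these lines should work. This is a minimal sketch, assuming a bs4 version that provides select_one(); the names second and second_text are just illustrative.

```python
from bs4 import BeautifulSoup

data = "<!DOCTYPE html><html><body><p>One</p><p>Two</p><p>Three</p></body></html>"
soup = BeautifulSoup(data, "html.parser")

# select_one() returns the first matching tag, or None if nothing matches.
second = soup.body.select_one("p:nth-of-type(2)")

# Guard against a missing match before asking for the text.
second_text = second.get_text() if second is not None else None
print(second_text)
```

The None check matters when the document might have fewer than two paragraphs; without it, calling get_text() on None would raise an AttributeError.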
Upvotes: 1
Reputation: 474191
BeautifulSoup would get you quite close to what you are asking for:
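from bs4 import BeautifulSoup
data = "<html><body><p>One</p><p>Two</p><p>Three</p></body></html>"
soup = BeautifulSoup(data, "html.parser")  # name the parser explicitly to avoid a warning
print(soup.html.body("p")[1].text)  # prints Two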
In other words, attribute access with a dot is a shortcut for find(), and calling a tag with parentheses is a shortcut for find_all().
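To make the two shortcuts concrete, here is a small sketch that spells each one out next to its long form. The variable names are only for illustration, and the HTML string is the one from the question.

```python
from bs4 import BeautifulSoup

data = "<html><body><p>One</p><p>Two</p><p>Three</p></body></html>"
soup = BeautifulSoup(data, "html.parser")

# Dot access returns the first matching descendant, like find().
body_via_dot = soup.html.body
body_via_find = soup.find("html").find("body")
same_body = body_via_dot is body_via_find

# Calling a tag returns all matching descendants, like find_all().
paragraphs_via_call = soup.html.body("p")
paragraphs_via_find_all = soup.html.body.find_all("p")
same_paragraphs = paragraphs_via_call == paragraphs_via_find_all

texts = [p.get_text() for p in paragraphs_via_find_all]
print(same_body, same_paragraphs, texts[1])
```

The long forms are easier to read in larger scripts, while the shortcuts stay closest to the ModuleName.html.body style asked about in the question.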
Upvotes: 2