climboid

Reputation: 6952

find the end of the body tag in an html file with python

Hi, I have the following code:

index = "app/index.html"
original = open(index,"r")
for line in original:
    if line =='</body>':
        print "here"
original.close()

but it doesn't seem to find the line '</body>'. Do I have to strip out potential whitespace even though the index.html file has none? Any clues on how to find the tag?
Thanks

Upvotes: 1

Views: 197

Answers (2)

Don Question

Reputation: 11614

Or you may try:

if '</body>' in line:
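A minimal sketch of how that substring test might fit into the loop from the question. It is written for Python 3; the helper name find_closing_body, the lower() call (so that </BODY> also matches) and the sample markup are assumptions added for illustration, not part of the original code.

```python
import io

def find_closing_body(lines):
    """Return the 1-based number of the first line containing </body>, or None."""
    for number, line in enumerate(lines, start=1):
        # Substring test: a trailing newline or other markup on the
        # same line no longer prevents a match.
        if '</body>' in line.lower():
            return number
    return None

# io.StringIO stands in for open("app/index.html"); the markup is made up.
sample = io.StringIO("<html>\n<body>\n<p>foo</p></body>\n</html>\n")
print(find_closing_body(sample))
```

Any iterable of lines works here, so an open file object can be passed in directly.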

Upvotes: 1

Doug T.

Reputation: 65599

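Right now you require that the line be exactly "</body>", with no whitespace at all. Each line you get by iterating over a file keeps its trailing newline, so the comparison fails even when the tag sits alone on its line. Also, valid HTML could have other content on the same line before the closing tag, since HTML just treats line endings as whitespace, i.e. you could have foo</body>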

The most direct way to solve your problem is to simply read the file's contents into a string and then call find on that string:

allText = original.read()
location = allText.find("</body>")
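One thing to watch with find is that it returns -1 rather than raising when the substring is missing. The following is a small sketch of handling that case, written for Python 3; the helper name body_end_offset, the sample string and the injected comment are hypothetical and only there to show one possible use of the offset.

```python
def body_end_offset(html_text):
    """Return the character offset of the first </body>, or None if absent."""
    location = html_text.find("</body>")
    # str.find returns -1 (it does not raise) when the substring is missing.
    return None if location == -1 else location

# Made-up sample; in the question this would be original.read().
all_text = "<html><body>foo</body></html>"
location = body_end_offset(all_text)
if location is not None:
    # Example use: insert a snippet just before the closing tag.
    patched = all_text[:location] + "<!-- injected -->" + all_text[location:]
    print(patched)
```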

There are also numerous HTML parsers out there that can do this work for you.
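As one possible illustration of the parser route, here is a sketch using html.parser from the Python 3 standard library. The class name BodyEndFinder and the sample markup are assumptions for the example; third-party parsers would look different.

```python
from html.parser import HTMLParser

class BodyEndFinder(HTMLParser):
    """Record where the first </body> end tag appears."""

    def __init__(self):
        super().__init__()
        self.position = None  # (line, column) once the tag is seen

    def handle_endtag(self, tag):
        # The parser lowercases tag names, so </BODY> matches too.
        if tag == "body" and self.position is None:
            self.position = self.getpos()

finder = BodyEndFinder()
# Made-up markup standing in for the contents of index.html.
finder.feed("<html>\n<body>\n<p>foo</p></body>\n</html>\n")
finder.close()
print(finder.position)
```

getpos() reports a 1-based line number and a 0-based column, which is handy if the goal is to point at the tag rather than slice the text.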

Upvotes: 0
