cit

Reputation: 2605

Python: using a regular expression to match one line of HTML

This simple Python function I put together just checks whether Tomcat is running on one of our servers.

import urllib2
import re
import sys

def tomcat_check():

    tomcat_status = urllib2.urlopen('http://10.1.1.20:7880')
    results = tomcat_status.read()
    pattern = re.compile(r'<body>Tomcat is running\.\.\.</body>')
    q = pattern.search(results)
    if q is None:  # search() returns None (never []) when nothing matches
        notify_us()
    else:
        print ("Tomcat appears to be running")
    sys.exit()

If this line is not found :

<body>Tomcat is running...</body>

It calls :

notify_us()

Which uses SMTP to send an email message to myself and another admin that Tomcat is no longer running on the server.

I have not used the re module in Python before, so I am assuming there is a better way to do this. I am also open to a more graceful solution with Beautiful Soup, but I haven't used that either.

Just trying to keep this as simple as possible...

Upvotes: 1

Views: 336

Answers (5)

tux21b

Reputation: 94699

As you have mentioned, regular expressions aren't suited to parsing XML-like structures (at least not for more complex queries). I would do something like this:

from lxml import etree
import urllib2

def tomcat_check(host='127.0.0.1', port=7880):
    response = urllib2.urlopen('http://%s:%d' % (host, port))
    html = etree.HTML(response.read())
    return html.findtext('.//body') == 'Tomcat is running...'

if tomcat_check('10.1.1.20'):
    print 'Tomcat is running...'
else:
    notify_us()  # notify someone, e.g. with the notify_us() from the question
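
If lxml is not installed, much the same check can be sketched with the standard library's HTMLParser. This is only an illustration, not the lxml approach above: the BodyText class and the body_text() helper are names made up for this example, and the try/except import is there so it runs on both Python 2 and 3. It also strips surrounding whitespace, on the assumption that the status page might put newlines around the message.

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2

class BodyText(HTMLParser):
    """Hypothetical helper: collects all text found inside <body>."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.in_body = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == 'body':
            self.in_body = True

    def handle_endtag(self, tag):
        if tag == 'body':
            self.in_body = False

    def handle_data(self, data):
        # Text of nested elements inside <body> is collected as well.
        if self.in_body:
            self.chunks.append(data)

def body_text(html):
    """Return the text inside <body>, with surrounding whitespace removed."""
    parser = BodyText()
    parser.feed(html)
    parser.close()
    return ''.join(parser.chunks).strip()
```

With that, the comparison becomes body_text(page) == 'Tomcat is running...', which still works if the page has whitespace around the message.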

Upvotes: 0

Nick Presta

Reputation: 28675

There are lots of different methods:

str.find()

if results.find("Tomcat is running...") != -1:
    print "Tomcat appears to be running"
else:
    notify_us()

Using X in Y

if "Tomcat is running..." in results:
    print "Tomcat appears to be running"
else:
    notify_us()

Using Regular Expressions

if re.search(r"Tomcat is running\.\.\.", results):
    print "Tomcat appears to be running"
else:
    notify_us()

Personally, I prefer the membership operator to test if the string is in another string.
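
A quick sketch of why the dots matter in the regex version. The sample pages and the by_* function names are invented for this illustration. The three approaches above agree with each other, but a pattern with unescaped dots also accepts any three characters where the "..." should be:

```python
import re

MESSAGE = 'Tomcat is running...'

# Made-up sample pages, not real Tomcat output.
ok_page = '<html><body>Tomcat is running...</body></html>'
odd_page = '<html><body>Tomcat is runningXYZ</body></html>'

def by_find(page):
    # str.find() returns -1 when the substring is absent.
    return page.find(MESSAGE) != -1

def by_in(page):
    # Membership test: plain substring search.
    return MESSAGE in page

def by_regex(page):
    # re.escape() turns each "." into a literal dot.
    return re.search(re.escape(MESSAGE), page) is not None

def by_unescaped_regex(page):
    # Here each "." matches any character, so this check is looser.
    return re.search(MESSAGE, page) is not None

for check in (by_find, by_in, by_regex, by_unescaped_regex):
    print('%s: ok_page=%s odd_page=%s' % (check.__name__, check(ok_page), check(odd_page)))
```

Only by_unescaped_regex reports a match for odd_page, which is one more reason to reach for the membership operator when the text is fixed.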

Upvotes: 1

ghostdog74

Reputation: 342433

if 'Tomcat is running' not in results:
    notify_us()

Upvotes: 2

msw

Reputation: 43497

Since you appear to be looking for a fixed string (not a regexp) that you have some control over and can be expected to be consistent, str.find() should do just fine. Or what Daniel said.

Upvotes: 0

Daniel Roseman

Reputation: 599630

Why use regex here at all? Why not just a simple string search:

if '<body>Tomcat is running...</body>' not in results:
    notify_us()
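
One caveat, sketched below under some assumptions: if Tomcat is really down, urlopen will usually raise an error (connection refused, timeout) instead of returning a page, so a string check alone would never get the chance to call notify_us(). The fetch_page() helper, the fetch parameter, and the 10 second timeout are all made up for this example; the try/except import only makes it run on both Python 2 and 3.

```python
try:
    from urllib2 import urlopen              # Python 2, as in the question
except ImportError:
    from urllib.request import urlopen       # Python 3

MARKER = '<body>Tomcat is running...</body>'

def fetch_page(url, timeout=10):
    """Download the page and return it as text (hypothetical helper)."""
    return urlopen(url, timeout=timeout).read().decode('utf-8', 'replace')

def tomcat_check(url, fetch=fetch_page):
    """Return True only if the page was fetched and contains MARKER."""
    try:
        page = fetch(url)
    except IOError:
        # URLError and socket timeouts are IOError subclasses
        # (IOError is an alias of OSError on Python 3).
        return False
    return MARKER in page
```

Usage would then be something like: if not tomcat_check('http://10.1.1.20:7880'): notify_us(). Passing fetch in as a parameter is only there so the check can be tried without a live server.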

Upvotes: 8
