Reputation: 279
This is my first question here. I'm trying to write a program that searches for the phrase "You have an error" in a page's HTML source. The problem is that when I try
import urllib2
html_data = urllib2.urlopen(site).read()
if html_data.find(string):
    print "It's found"
it doesn't find it, although when I print html_data the phrase is in there, with no tags inside it. Can anybody help me with this?
Upvotes: 2
Views: 1071
Reputation: 925
Do the upper/lower cases match the page you are looking at? Would you be able to give us the page you are trying to read this from? Because this code seems to work fine:
>>> string = 'You have an error'
>>> page = """
... You have an error
... """
>>> if string in page:
...     print "It's found"
...
It's found
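If the case does turn out to differ, a case-insensitive check is one way around it. This is only a sketch: the helper name contains_ignore_case and the sample page text are made up for illustration, not taken from the page in the question.

```python
def contains_ignore_case(haystack, needle):
    """Return True if needle occurs in haystack, ignoring letter case."""
    return needle.lower() in haystack.lower()


# Hypothetical page content whose casing differs from the search phrase.
page = "<p>you have an ERROR in your SQL syntax</p>"

# The plain `in` test is case-sensitive, so it misses this page.
print("You have an error" in page)

# Lower-casing both sides first makes the comparison case-insensitive.
print(contains_ignore_case(page, "You have an error"))
```

Lower-casing both strings is enough for plain ASCII phrases like this one.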
Upvotes: 1
Reputation: 500327
str.find() returns the index of the first match, or -1 if the string is not found. Because -1 is truthy and 0 (a match at the very start) is falsy, the following test is incorrect:
if html_data.find(string):
It should be:
if html_data.find(string) != -1:
Alternatively, if you don't need to know the position of the match:
if string in html_data:
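To see why the truthiness test misleads, here is a small sketch. The sample html_data string and the contains helper are invented for illustration, and the print calls are written so they behave the same under Python 2 and Python 3.

```python
def contains(haystack, needle):
    """Return True if needle occurs anywhere in haystack."""
    return needle in haystack


# Made-up sample text standing in for the downloaded page.
html_data = "You have an error in your SQL syntax"

# Match at the very start: find() returns 0, which is falsy.
start_index = html_data.find("You have an error")

# No match at all: find() returns -1, which is truthy.
missing_index = html_data.find("no such phrase")

print("%d %s" % (start_index, bool(start_index)))
print("%d %s" % (missing_index, bool(missing_index)))

# The membership test gives the answer that was actually wanted.
print(contains(html_data, "You have an error"))
print(contains(html_data, "no such phrase"))
```

So `if html_data.find(string):` takes the branch when the phrase is missing and skips it when the phrase sits at index 0, which is the opposite of what was intended in both cases.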
Upvotes: 2
Reputation: 66
The find method returns -1 if it doesn't find the string, not 0. So you should use it like this:
if html_data.find(string) != -1:
Upvotes: 0
Reputation: 3808
Sometimes page content is generated dynamically when the page's JavaScript loads and runs. In that case you will need to execute the JavaScript to get the same markup that the browser ends up displaying. You might want to write a browser extension for this, which then sends what it finds to your Python server if required. The advantage is that you get to use the browser's JavaScript VM.
Upvotes: 0