Reputation: 13
The most recent version of my working script has been included at the bottom of the post. I am looking into how to wiki this.
Good day. I have the following code, and I am wondering how to search the results for a match. I will be trying to match two to three words. I have tried html2text, BeautifulSoup, re.search, and several others. Either I have not implemented the things I've tried correctly, or they just don't work.
import requests
s = requests.session()
url = 'http://company.name.com/donor/index.php'
values = {'username': '1234567',
          'password': '7654321'}
r = s.post(url, data=values)
# page which requires being logged in to view
url = "http://company.name.com/donor/donor.php"
# sending cookies as well
result = s.get(url)
I've tried many different ways and just can't get it. Which module will I need to work with? And will I need to change the form of the data that "result" is in? One thing I haven't tried is writing "result" to a text file. I guess I could do that and then search for my matches in that file... I'm just thinking there is a very simple way to do this.
Thanks for any help or direction.
Updated/Edited Script:
## Script will log in, navigate to the correct page, search for a match, then print and text/SMS the result.
import re
import urllib
import smtplib
import requests
from bs4 import BeautifulSoup
s = requests.session()
url = 'http://company.name.com/donor/index.php'
values = {'username': '123456',
          'password': '654321'}
r = s.post(url, data=values)
# Now you have logged in
url = "http://company.name.com/donor/donor.php"
# sending cookies as well
result = s.get(url)
print (result.headers)
print (result.text)
result2 = (result.text)
match1 = re.findall('FindMe', result2)  # look for "FindMe" in "result2"
if match1:  # at least one match was found (== 1 would miss pages with several matches)
    matchresult = 'Yes it matched'
    print(matchresult)
else:  # no match was found
    matchresult = 'Houston we have a problem'
    print(matchresult)
# send text from gmail account portion of code starts here.
body = matchresult
body = "" + body + ""
headers = ["From: " + 'Senders Name',
           "Subject: " + 'Type Subject Information',
           "To: " + '[email protected]',  # phone number and cell carrier @address
           "MIME-Version: 1.0",
           "Content-Type: text/html"]
headers = "\r\n".join(headers)
session = smtplib.SMTP('smtp.gmail.com', '587')
session.ehlo()
session.starttls()
session.ehlo()  # call ehlo again after STARTTLS (the parentheses were missing)
session.login('[email protected]', 'passwordforemailaddress')
session.sendmail('senders name', '[email protected]', headers + "\r\n\r\n" + body)
session.quit()
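The hand-joined header list above could also be built with the standard library's `email.message.EmailMessage`, which writes the MIME headers itself. This is only a sketch under assumptions: `build_message` is a hypothetical helper, the addresses are placeholders, and it sends plain text rather than the `text/html` used in the script.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text message; the library writes the MIME headers."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


# All addresses and strings below are placeholders, not values from the script.
msg = build_message(
    "sender@example.com",
    "recipient@example.com",
    "Type Subject Information",
    "Yes it matched",
)
print(msg["Subject"])
# With a logged-in smtplib.SMTP session, the message would be sent via
# session.send_message(msg) instead of session.sendmail(...).
```

No network connection is needed to build or inspect the message, so this part can be checked separately from the login and SMTP steps.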
Upvotes: 0
Views: 4335
Reputation: 82899
Still not sure whether I understood the question correctly, but based on the additional information from your comment, it should suffice to do something like this:
import urllib2
page = urllib2.urlopen("http://your.url.com")
content = page.read()
if "congratulations" in content:
    print ...
if "We're sorry" in content:
    print ...
As you are looking for very specific words, there is no need to use regular expressions to match some more general pattern, or an HTML parser to look into the structure of the document. Just see whether the string is in the document.
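The same check carries over to the `requests` session from the question, where the page body is available as the string `result.text`. A minimal sketch, assuming a hypothetical helper `find_phrases` and a made-up sample page standing in for `result.text`; the case-insensitive comparison is an extra assumption, not something the question asked for:

```python
def find_phrases(text, phrases):
    """Return the phrases that occur in text, ignoring case."""
    lowered = text.lower()
    return [p for p in phrases if p.lower() in lowered]


# Stand-in for result.text from the question's requests session (sample data).
sample_html = "<html><body><p>Congratulations, you are eligible.</p></body></html>"

found = find_phrases(sample_html, ["congratulations", "We're sorry"])
print(found)
```

In the real script, `find_phrases(result.text, [...])` would replace the sample string, and an empty list would mean none of the phrases appeared on the page.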
Upvotes: 1