Mohideen bin Mohammed

Reputation: 20147

How to monitor particular content on a web page using Python at a set time interval?

I want to monitor changes to some content that is present on certain web pages. I want to do this on a daily basis, using any script or browser plugin.

For example, I want to get notified if particular content on some web pages changes, based on my query, without signing up for the site's own subscription service.

Upvotes: 0

Views: 142

Answers (2)

Mohideen bin Mohammed

Reputation: 20147

Here is my code showing how I scrape a table from one site. On that site they didn't define an id or class on the table, so you don't need to add anything. If an id or class is present, just use html.xpath('//table[@id=id_val]/tr') instead of html.xpath('//table/tr').

import time
import urllib.request  # Python 3; on Python 2 use urllib.urlopen instead
from lxml import etree

while True:
    web = urllib.request.urlopen("http://www.yoursite.com/")
    html = etree.HTML(web.read())
    tr_nodes = html.xpath('//table/tr')
    # keep rows whose third cell is 'Chennai', 'Across India',
    # or contains 'Chennai' in a '/'-separated list
    td_content = []
    for tr in tr_nodes:
        cells = [td.text for td in tr.xpath('td')]
        if cells[2] in ('Chennai', 'Across India') or 'Chennai' in cells[2].split('/'):
            td_content.append(tr.xpath('td'))
    main_list = []
    for i in td_content:
        # sixth cell holds the experience level; keep freshers / 0-year rows
        if i[5].text == 'Freshers' or 'Freshers' in i[5].text.split('/') or '0' in i[5].text.split(' '):
            sub_list = [td.text for td in i]
            sub_list.insert(6, 'http://yoursite.com/%s' % i[6].xpath('a')[0].get('href'))
            main_list.append(sub_list)
    print('main_list', main_list)
    time.sleep(60)  # 1 minute interval
    # time.sleep(86400)  # 1 day interval
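To illustrate the id-based xpath mentioned above without hitting a live site, here is a small self-contained demo. It uses the standard library's ElementTree on a made-up inline snippet (the table id "jobs" and its contents are invented for this example); with lxml the equivalent call would be html.xpath('//table[@id="jobs"]/tr').

```python
import xml.etree.ElementTree as ET

# Inline snippet standing in for the fetched page; the id "jobs"
# and the cell values are made up for this demo.
sample = """
<html><body>
  <table id="jobs">
    <tr><td>Role</td><td>City</td></tr>
    <tr><td>Developer</td><td>Chennai</td></tr>
  </table>
</body></html>
"""

root = ET.fromstring(sample)
# With an id present, the path can target that table directly
rows = root.findall('.//table[@id="jobs"]/tr')
cells = [[td.text for td in tr.findall('td')] for tr in rows]
print(cells)  # [['Role', 'City'], ['Developer', 'Chennai']]
```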

Upvotes: 2

Sajjan Kumar

Reputation: 373

You can do this simply by writing a Python script based on the urllib/requests/BeautifulSoup modules.

What you have to do is write a function that parses the required part of the website, then check in a loop whether it meets your requirement. If it does, exit the loop; if it doesn't, wait some time (using the time module's time.sleep() function) and check again, over and over.

import time

def parse(url):
    # fetch the page and extract the content you want,
    # e.g. with urllib/requests plus BeautifulSoup
    ...

def monitor(url, interval):
    while True:
        content = parse(url)
        if content_meets_requirement(content):  # your own check
            break  # condition met, stop polling
        time.sleep(interval)  # time after which you want to recheck

That's it, you are done. Don't forget to import the modules! :)
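Neither answer shows how to decide whether the content actually changed between polls. One simple approach (a sketch; the function names here are illustrative, not from either answer) is to hash the extracted text each run and compare it with the previous run's hash:

```python
import hashlib

def fingerprint(content):
    """Return a stable hash of the extracted content."""
    if isinstance(content, str):
        content = content.encode("utf-8")
    return hashlib.sha256(content).hexdigest()

def check_for_change(last_hash, new_content):
    """Compare freshly scraped content against the previous poll.

    Returns (changed, new_hash) so the caller can store new_hash
    for the next iteration of the polling loop.
    """
    new_hash = fingerprint(new_content)
    return new_hash != last_hash, new_hash
```

Inside the polling loop you would keep the last hash, call check_for_change on each new extraction, and only send a notification (email, desktop alert, etc.) when it reports a change.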

Upvotes: 2
