Reputation: 85
[http://www.canadapost.ca/cpotools/apps/track/personal/findByTrackNumber?execution=e1s1]
For example, tracking number LM920347139CN.
I want to extract the tracking history data, but the page uses a redirect.
How can I handle this? Ideally I would like a way to get the data without the presentation markup.
Upvotes: 0
Views: 808
Reputation: 87114
EDIT
Apparently there are REST and SOAP APIs available for tracking. See http://www.canadapost.ca/cpo/mc/business/productsservices/developers/services/tracking/default.jsf
The easiest (non-API) way is probably to use the mechanize module which you can get from PyPI. You use it like a web browser. It will follow the redirect for you and manage any cookies as required by this particular web site. Example:
import mechanize
br = mechanize.Browser()
url = 'http://www.canadapost.ca/cpotools/apps/track/personal/findByTrackNumber'
response = br.open(url)
br.select_form('tapByTrackSearch:trackSearch')
br.form['tapByTrackSearch:trackSearch:trackNumbers'] = 'LM920347139CN'
response = br.submit()
html = response.read()
If you prefer to use requests, or if you need to support Python 3, requests will also follow redirects and manage cookies as required:
import requests
s = requests.Session()
url = 'http://www.canadapost.ca/cpotools/apps/track/personal/findByTrackNumber'
response = s.get(url)
With requests, however, you will need to set up the required POST form fields (which I do not show here).
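As a rough sketch of what setting up those fields could involve: the page appears to be a JSF application, so the form most likely carries hidden inputs (such as a view-state token) that have to be sent back along with the tracking number. The helper below uses only the standard library to collect every input of a named form and then applies your own values on top. The names `FormFieldCollector` and `build_payload` are made up for this illustration, and the form and field names in the usage comment are simply the ones from the mechanize example above; I have not verified them against the live site.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect the action URL and <input> name/value pairs of one form."""

    def __init__(self, form_name):
        super().__init__()
        self.form_name = form_name
        self.in_form = False
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == 'form' and self.form_name in (attrs.get('name'), attrs.get('id')):
            # Entering the form we are interested in.
            self.in_form = True
            self.action = attrs.get('action')
        elif tag == 'input' and self.in_form and attrs.get('name'):
            # Keeps hidden inputs too; note that submit buttons are also
            # collected, which may or may not be what the server expects.
            self.fields[attrs['name']] = attrs.get('value') or ''

    def handle_endtag(self, tag):
        if tag == 'form':
            self.in_form = False


def build_payload(html, form_name, overrides):
    """Return (action, payload) for the named form, with overrides applied."""
    collector = FormFieldCollector(form_name)
    collector.feed(html)
    payload = dict(collector.fields)
    payload.update(overrides)
    return collector.action, payload


# Possible usage (assumed, untested against the real site):
#   from urllib.parse import urljoin
#   page = s.get(url)
#   action, payload = build_payload(
#       page.text, 'tapByTrackSearch:trackSearch',
#       {'tapByTrackSearch:trackSearch:trackNumbers': 'LM920347139CN'})
#   response = s.post(urljoin(page.url, action or ''), data=payload)
```

Copying the hidden fields from the fetched page rather than hard-coding them matters because tokens of this kind usually change per session.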
Once you have the HTML you can use an HTML parser such as BeautifulSoup to process and extract the required data.
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly to avoid a warning
tracking_table = soup.find(id='tapListResultForm:table_2')
.
.
.
from which you can extract the tracking data.
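To give an idea of that last step, here is a minimal sketch that turns a table into a list of rows, each row being a list of cell strings. The function name `table_rows` is made up for this example, and I am assuming nothing about the columns of the real tracking table, so you will need to inspect the output to see which cell holds the date, location, and so on.

```python
from bs4 import BeautifulSoup


def table_rows(html, table_id):
    """Return the text of every cell in the table with the given id, row by row."""
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find(id=table_id)
    if table is None:
        # The id was not found, e.g. the page layout changed or the
        # tracking number returned no results.
        return []
    rows = []
    for tr in table.find_all('tr'):
        # Header (<th>) and data (<td>) cells are treated alike.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(['th', 'td'])]
        if cells:
            rows.append(cells)
    return rows
```

With the page from above you would call it as `table_rows(html, 'tapListResultForm:table_2')`.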
Upvotes: 1