sam

Reputation: 19164

Error while fetching a URL using Python

I am trying to download a file from a URL using Python.
If I click the URL in a browser, I get the Excel file.
But if I run the following code, it gives me unexpected output.

>>> import urllib2
>>> urllib2.urlopen('http://intranet.stats.gov.my/trade/download.php?id=4&var=2012/2012%20MALAYSIA%27S%20EXPORTS%20BY%20ECONOMIC%20GROUPING.xls').read()

Output:

"<script language=javascript>window.location='2012/2012 MALAYSIA\\'S EXPORTS BY ECONOMIC GROUPING.xls'</script>"

Why is urllib2 not able to read the file content?

Upvotes: 1

Views: 302

Answers (2)

Nolen Royalty

Reputation: 18633

In a related question, @Kai seems to have found a way to follow JavaScript redirects using the Selenium module:

import time

from selenium import webdriver

driver = webdriver.Firefox()
link = "http://yourlink.com"
driver.get(link)

# Poll until the browser has followed the JavaScript redirect
while driver.current_url == link:
    time.sleep(1)

redirected_url = driver.current_url

Upvotes: 1

itayw

Reputation: 614

Take a look with an HTTP listener (or even Google Chrome Developer Tools): the page you land on redirects using JavaScript. urllib2 does not execute JavaScript, so it only returns the script that performs the redirect.

You will need to request the initial URL, parse the redirect target out of the response, and then fetch that actual URL.
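
The steps above could be sketched roughly as follows. This is only a minimal sketch, not a tested solution for that server: it assumes Python 3 (urllib.request and urllib.parse instead of urllib2), assumes the response always has the exact window.location='...' form shown in the question, and assumes the target is a path relative to the page URL. The helper names extract_js_redirect and fetch_following_js_redirect are made up for illustration.

```python
import re
from urllib.parse import quote, urljoin
from urllib.request import urlopen

# Matches window.location='...' and tolerates backslash-escaped quotes inside.
# Assumption: the page uses single quotes, as in the output shown in the question.
_REDIRECT_RE = re.compile(r"window\.location\s*=\s*'((?:\\.|[^'\\])*)'")


def extract_js_redirect(html, base_url):
    """Return the absolute redirect target found in html, or None if there is none."""
    match = _REDIRECT_RE.search(html)
    if match is None:
        return None
    # Undo JavaScript escaping such as \' -> '
    target = re.sub(r"\\(.)", r"\1", match.group(1))
    # Percent-encode spaces and quotes, then resolve against the page URL.
    return urljoin(base_url, quote(target, safe="/:?&=%"))


def fetch_following_js_redirect(url):
    """Fetch url; if the body is a JavaScript redirect, fetch the target instead."""
    body = urlopen(url).read()
    # latin-1 never fails to decode; the encoding of the page is an assumption.
    target = extract_js_redirect(body.decode("latin-1"), url)
    if target is None:
        return body
    return urlopen(target).read()
```

Calling fetch_following_js_redirect with the download.php URL from the question should then return the bytes of the .xls file, provided the server really serves it at the resolved path. This is lighter than driving a browser with Selenium, but it breaks as soon as the page builds its redirect in any other way.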

Upvotes: 1
