Why does xml retrieved from a site not look like web browser content?

Question

I've been trying to fetch the xml data found here: http://www.thetvdb.com/api/D1BD82E2AE599ADD/mirrors.xml

The XML is easy to read in a web browser. When I load it with urllib2 (following the tutorial at http://www.doughellmann.com/PyMOTW/urllib2/), however, I get the following:

import urllib2
response = urllib2.urlopen('http://www.thetvdb.com/api/D1BD82E2AE599ADD/mirrors.xml')

print response.read()

Output:

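'<?xml version="1.0" encoding="UTF-8" ?>
<Mirrors>
  <Mirror>
    <id>1</id>
    <mirrorpath>http://thetvdb.com</mirrorpath>
    <typemask>7</typemask>
  </Mirror>
</Mirrors>
'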

I have tried with other websites (e.g.: python.org) and it seems to work. The problem seems to be library independent (I've had the same problem with urllib, httplib, httplib2, ...) and the problem seems to be specific to the site I'm trying to fetch.

What gives?

EDIT: okay, it seems as though I was confused as to what I "should" be seeing. Out of curiosity, does anybody know what the "script" section is? I'm viewing the page using google chrome (stable).

user177800 · Accepted Answer

"It looks nothing like the data that is shown if the page is loaded in a web browser. I'm updating the question with this information.."

When I get that example URL with Chrome I get exactly what you are getting with your Python code, the raw data.

Your browser automatically detects the XML and renders it as formatted HTML. The underlying data is exactly the same as what Python receives: the raw XML. The browser's rendering is what is misleading you about what to expect.
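Since the raw data is ordinary XML, it can be handed straight to a parser. Here is a minimal sketch using the standard library's xml.etree.ElementTree, written for Python 3. The sample payload and its element names (Mirrors, Mirror, id, mirrorpath, typemask) are assumptions for illustration, as is the helper name parse_mirrors; in practice you would pass in the bytes returned by response.read().

```python
import xml.etree.ElementTree as ET

# Sample payload standing in for response.read(); the element names
# are assumptions for illustration, not confirmed from the server.
raw_xml = b"""<?xml version="1.0" encoding="UTF-8" ?>
<Mirrors>
  <Mirror>
    <id>1</id>
    <mirrorpath>http://thetvdb.com</mirrorpath>
    <typemask>7</typemask>
  </Mirror>
</Mirrors>
"""

def parse_mirrors(xml_bytes):
    """Hypothetical helper: turn each <Mirror> element into a dict."""
    root = ET.fromstring(xml_bytes)
    return [
        {child.tag: child.text for child in mirror}
        for mirror in root.findall("Mirror")
    ]

mirrors = parse_mirrors(raw_xml)
print(mirrors)
```

The browser's tree view and this parsed structure are two presentations of the same bytes.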

NOTE: Don't trust what the Developer Tools show you. They display HTML that, in this case, is a wrapper Chrome generates around the response to enable its interactive XML display (code folding via JavaScript and all that other bling). That wrapper is not what the server actually sends; to see the real response, use View Source.
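One way to check what the server actually sends, independent of any browser rendering, is to look at the Content-Type header and the raw bytes from code. This is a rough sketch for Python 3, where urllib.request is the successor to urllib2; the function name fetch_raw is made up for this example.

```python
import urllib.request

def fetch_raw(url):
    """Hypothetical helper: return (content_type, body_bytes) for a URL.

    Nothing is rendered or wrapped here; the body is the byte string
    delivered in the response.
    """
    with urllib.request.urlopen(url) as response:
        content_type = response.headers.get("Content-Type")
        body = response.read()
    return content_type, body

# Example call (needs network access, so it is left commented out):
# content_type, body = fetch_raw(
#     "http://www.thetvdb.com/api/D1BD82E2AE599ADD/mirrors.xml")
# print(content_type)
# print(body[:200])
```

If the body starts with an XML declaration rather than HTML markup, the server is sending plain XML and any HTML you saw was added by the browser.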
