andreSmol

Reputation: 1038

Python: the web page does not seem to see all my data and does not return the correct answer

If I enter this URL, I get XML-formatted text as a response:

http://skrutten.nada.kth.se/scrut/svesve/?text=g%E5&url=&xmlout=on&x=Granska

Now I am trying to write a Python function that POSTs the same data to the web page, and I expect to get the same information back. However, I get less information from the web page. It seems that some of the data (the text field) is not "seen", but I do not get any error messages. For instance, I am not sure whether I am handling the encoding correctly (I tried UTF-8, with no difference).

Any suggestions about my mistake?

import urllib.parse
import urllib.request

def fetch_web (name, par1):
    """Fetch the web data defined by name and return a string with the web page"""

    if name == "granska":
        url = "http://skrutten.nada.kth.se/scrut/svesve/"
        values = {"text":par1,"xmlout":"on","x":"Granska","url":""}
        code = "ISO-8859-1"
    data = urllib.parse.urlencode(values)
    data = data.encode(code)

    req = urllib.request.Request(url,data)
    response = urllib.request.urlopen(req)

    page = response.read()
    return page.decode("ISO-8859-1")

print (fetch_web("granska","gå"))

Upvotes: 3

Views: 244

Answers (1)

chown

Reputation: 52778

Try this

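import urllib.parse
import urllib.request

def fetch_web(name, par1):
    """Fetch the web data defined by name and return a string with the web page"""

    if name == "granska":
        url = "http://skrutten.nada.kth.se/scrut/svesve/"
        values = {"text": par1, "xmlout": "on", "x": "Granska", "url": ""}
        code = "ISO-8859-1"

    # Percent-encode the parameters using the charset in `code` (assumed to be
    # what the server expects, since the working URL contains g%E5).
    # The result stays a str: formatting bytes into the URL would yield "b'...'".
    data = urllib.parse.urlencode(values, encoding=code)
    # Append the query string to the URL so the request is sent as a GET
    full_url = "%s?%s" % (url, data)

    req = urllib.request.Request(full_url)
    response = urllib.request.urlopen(req)

    page = response.read()
    return page.decode(code)

print(fetch_web("granska", "gå"))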

The problem may be caused by the lack of a file name at the end of the URL. I've run into the same thing with several of my scripts: no "?" is added between the URL and the POST data when the URL ends in a trailing "/". I think it's a bug in the urllib2.Request class, but since I found a workaround, I have put off checking whether a bug report exists.
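To check what actually ends up after the "?", it can help to print the query string before sending anything. Here is a minimal sketch, assuming the server expects ISO-8859-1 (my guess, based on the g%E5 in the URL from the question); the variable names are just for illustration:

```python
from urllib.parse import urlencode

values = {"text": "gå", "xmlout": "on", "x": "Granska", "url": ""}

# Default behaviour: non-ASCII characters are percent-encoded as UTF-8 bytes,
# so "å" becomes the two bytes %C3%A5.
utf8_query = urlencode(values)

# With an explicit encoding, "å" becomes the single Latin-1 byte %E5,
# which matches the URL that works in the browser.
latin1_query = urlencode(values, encoding="ISO-8859-1")

print(utf8_query)
print(latin1_query)
```

If the two printed strings differ only in how "å" is encoded, the charset used for percent-encoding is a likely suspect for the missing data.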

Also, if name != "granska", you will get an exception, because url and values are never assigned before they are used in the data = ... lines.
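One way to avoid that is to fail early with a clear error for names the function does not know. A rough sketch, where build_url is a hypothetical helper (not part of the original code) that only builds the GET URL and does not send a request, and ISO-8859-1 is again an assumption about the server:

```python
import urllib.parse

def build_url(name, par1):
    """Hypothetical helper: return the full GET URL for a known service name."""
    if name != "granska":
        # Fail early instead of hitting unassigned variables further down
        raise ValueError("unknown service name: %r" % name)

    url = "http://skrutten.nada.kth.se/scrut/svesve/"
    values = {"text": par1, "xmlout": "on", "x": "Granska", "url": ""}
    # Assumed charset; see the g%E5 in the URL from the question
    query = urllib.parse.urlencode(values, encoding="ISO-8859-1")
    return "%s?%s" % (url, query)

print(build_url("granska", "gå"))
```

The returned string can then be passed to urllib.request.urlopen as in the function above.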

Upvotes: 3
