How to fix complicated HTML encoding for URL in python script?

Question

I have a nightmare situation on my hands (or maybe it is easy, I don't know). I have a small function that runs in a rather large Python script. Everything else in the larger script is worked out, and at the end the script calls our web map services and shows the parcels in question. We have 20K parcels, and ONLY 10 of them have a '%' in the deedholder name. So this works over 99% of the time, but there is always that 1% (or way less in this case).

The problem is that in the rare case where there is a percent sign in the deedholder name, the URL I supply returns no results for the query. I have tested a ton of names, and it only fails when there is a percent sign in the name.

So the prefix will look like this:

'https://cedar.integritygis.com/default.aspx?ql=Parcel&qf=REALDATA_DEEDHOLDER&qv='

and the name is added to the end, which looks like this:

'COOPER MICHAEL A & DEBRA K'

My code can easily replace the spaces with '%20', the & with '%26', and so on. But what do I do when THIS is the deedholder name:

'SIEBELS LAWRENCE J (75%) & LOUISE F TRUST (25%)'

I cannot successfully get this query to work. Here is my test code with just the function in question:

import webbrowser, time

def FixURL(string):

##    string = string.replace('%','~')
    print string
    fix_dict = {' ':'%20','!':'%21','"':'%22','#':'%23','$':'%24',
                '&':'%26',"'":'%27','(':'%28',')':'%29',
                '*':'%2A','+':'%2b','.':'%2E','/':'%2F',':':'%3A',
                ';':'%3B','?':'%3F','@':'%40','{':'%7B','}':'%7D'}

    for k,v in fix_dict.iteritems():
        if k in string:
            string = string.replace(k,v)
##    return string.replace('~','%25')
    return string

if __name__ == '__main__':

    # testing
    easy = FixURL('COOPER MICHAEL A & DEBRA K')
    prefix = 'https://cedar.integritygis.com/default.aspx?ql=Parcel&qf=REALDATA_DEEDHOLDER&qv='
    url = '{}{}'.format(prefix,easy)
    print easy
    webbrowser.open(url)
    time.sleep(15)  # give it time to work

    hard = FixURL('SIEBELS LAWRENCE J (75%) & LOUISE F TRUST (25%)')
    print hard
    url = '{}{}'.format(prefix,hard)
    webbrowser.open(url)

I cannot figure out how to 'trick' it. You can see my unsuccessful attempts are commented out. Does anyone have a fix? One thing I am thinking of doing is removing the space from the dictionary, using '%20'.join(string.split()), and testing each item in the list for replacement values for the URL. Does anyone have any ideas? It seems I have been squeezed by Python yet again. Thanks.

EDIT:

I have since scrapped the entire function and am just using urllib.quote(). Here is my test:

import webbrowser, urllib, time

prefix = 'https://cedar.integritygis.com/default.aspx?ql=Parcel&qf=REALDATA_DEEDHOLDER&qv='
easy = urllib.quote('COOPER MICHAEL A & DEBRA K')
url = '{}{}'.format(prefix,easy)
print easy
webbrowser.open(url)
time.sleep(15)  # give it time to work

hard = urllib.quote('SIEBELS LAWRENCE J (75%) & LOUISE F TRUST (25%)')
print hard
url = '{}{}'.format(prefix,hard)
webbrowser.open(url)

This is supposed to zoom to the parcels owned by the name supplied. The first one works; the second does not, because of the % inside the parentheses (I think). I get the good ol' "query returned no results" error.

arghbleargh · Accepted Answer

You can use Python's standard urllib module to do this.

http://docs.python.org/2/library/urllib.html#utility-functions

Look at the utility functions. urllib.quote will probably do the job.
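
As a rough sketch of what that could look like with the names from the question: the snippet below percent-encodes the deedholder name in one pass, so a literal '%' becomes '%25' instead of being left alone or encoded twice. The helper name build_parcel_url and the PREFIX constant are made up for illustration, and passing safe='' (so that '/' is escaped as well) is my assumption about what the qv parameter needs. The try/except import is only there so the same code runs on Python 2 and Python 3.

```python
try:
    from urllib.parse import quote  # Python 3 location
except ImportError:
    from urllib import quote        # Python 2 location, as used in the question

# URL prefix copied from the question.
PREFIX = ('https://cedar.integritygis.com/default.aspx'
          '?ql=Parcel&qf=REALDATA_DEEDHOLDER&qv=')


def build_parcel_url(name):
    """Hypothetical helper: percent-encode a deedholder name and append it."""
    # safe='' is an assumption: it makes quote() escape '/' too, which it
    # would otherwise leave untouched.
    return PREFIX + quote(name, safe='')


hard = 'SIEBELS LAWRENCE J (75%) & LOUISE F TRUST (25%)'
print(quote(hard, safe=''))
# SIEBELS%20LAWRENCE%20J%20%2875%25%29%20%26%20LOUISE%20F%20TRUST%20%2825%25%29
print(build_parcel_url(hard))
```

This only shows that the encoding step itself produces a valid percent-encoded value. If the query still returns no results with that URL, the cause may lie in how the web map service decodes the qv parameter rather than in the Python side, but that is a guess I cannot confirm from here.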
