SIM

Reputation: 22440

Facing urlencoding issue while processing request

I've written a Python script to scrape some information from a webpage. The site requires a GET request. The issue I'm facing is that the parameters need to be merged into the URL, so they must be properly URL-encoded. This is where I'm stuck: I can't encode them properly to get a valid response. I gave it a try, but it doesn't bring any

The script I was trying with:

import requests
import urllib.parse

fields ={
'/API/api/v1/Search/Properties/?f':'319 lizzie','ty':'2018','pvty':'2017','pn':'1','st':'9','so':'1','pt':'RP;PP;MH;NR','take':'20','skip':'0','page':'1','pageSize':'20'
}
payload = urllib.parse.quote_plus(fields, safe='', encoding=None, errors=None)

headers={
"User-Agent":"Mozilla/5.0"
}

page = requests.get("http://search.wcad.org/Proxy/APIProxy.ashx?", params=payload, headers=headers)
print(page.json())

The above URL should look like this:

http://search.wcad.org/Proxy/APIProxy.ashx?/API/api/v1/Search/Properties/?f=319%20LIZZIE&ty=2018&pvty=2017&pn=1&st=9&so=1&pt=RP%3BPP%3BMH%3BNR&take=20&skip=0&page=1&pageSize=20

to get the response.

Btw, this is the error I'm having with my existing script:

Traceback (most recent call last):
  File "C:\Users\ar\AppData\Local\Programs\Python\Python35-32\Social.py", line 9, in <module>
    payload = urllib.parse.quote_plus(fields, safe='', encoding=None, errors=None)
  File "C:\Users\ar\AppData\Local\Programs\Python\Python35-32\lib\urllib\parse.py", line 728, in quote_plus
    string = quote(string, safe + space, encoding, errors)
  File "C:\Users\ar\AppData\Local\Programs\Python\Python35-32\lib\urllib\parse.py", line 712, in quote
    return quote_from_bytes(string, safe)
  File "C:\Users\ar\AppData\Local\Programs\Python\Python35-32\lib\urllib\parse.py", line 737, in quote_from_bytes
    raise TypeError("quote_from_bytes() expected bytes")
TypeError: quote_from_bytes() expected bytes

Upvotes: 0

Views: 351

Answers (1)

Tomalak

Reputation: 338228

This works. As the documentation indicates, there is no need to do any URL encoding yourself.

The point is that, for this endpoint, the query string begins at the last question mark, not at the first. Including the second question mark in the URL is mandatory, because requests only adds one when the URL does not already contain one.

import requests

url = "http://search.wcad.org/Proxy/APIProxy.ashx?/API/api/v1/Search/Properties/?"
params = {'f':'319 lizzie','ty':'2018','pvty':'2017','pn':'1','st':'9','so':'1','pt':'RP;PP;MH;NR','take':'20','skip':'0','page':'1','pageSize':'20'}

response = requests.get(url, params=params)

response.json()

results in

{
    'ResultList': [{
        'PropertyQuickRefID': 'R016698',
        'PartyQuickRefID': 'O0485204',
        'OwnerQuickRefID': 'R016698',
        'LegacyID': None,
        'PropertyNumber': 'R-13-0410-0620-50000',
        'OwnerName': 'GOOCH, PHILIP L',
        'SitusAddress': '319 LIZZIE ST, TAYLOR, TX  76574',
        'PropertyValue': 46785.0,
        'LegalDescription': 'DOAK ADDITION, BLOCK 62, LOT 5',
        'NeighborhoodCode': 'T541',
        'Abstract': None,
        'Subdivision': 'S3564 - Doak Addition',
        'PropertyType': 'Real',
        'ID': 0,
        'Text': None,
        'TaxYear': 2018,
        'PropertyValueTaxYear': 2017
    }],
    'HasMoreData': False,
    'TotalPageCount': 1,
    'CurrentPage': 1,
    'RecordCount': 1,
    'SearchText': '319 lizzie',
    'PagingHandledByCaller': False,
    'TaxYear': 2018,
    'PropertyValueTaxYear': 0
}
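
As for the TypeError in the question: urllib.parse.quote_plus expects a single str or bytes value, so handing it a whole dict fails. If you ever did need to build the query string by hand, urllib.parse.urlencode is the function that accepts a mapping. The following is only a minimal sketch using the standard library; the shortened params dict and the variable names are my own for illustration, and passing quote_via=quote is an assumption made to get %20 for spaces, as in the target URL from the question, instead of the default '+'.

```python
from urllib.parse import quote, quote_plus, urlencode

# Shortened, illustrative subset of the parameters from the question.
params = {'f': '319 lizzie', 'ty': '2018', 'pt': 'RP;PP;MH;NR'}

# quote_plus works on one string at a time, so a dict raises TypeError.
try:
    quote_plus(params)
    dict_rejected = False
except TypeError:
    dict_rejected = True

# urlencode accepts a mapping and encodes each key and value.
# quote_via=quote encodes spaces as %20 rather than '+'.
query = urlencode(params, quote_via=quote)

base = "http://search.wcad.org/Proxy/APIProxy.ashx?/API/api/v1/Search/Properties/?"
full_url = base + query
print(full_url)
```

With requests this step is unnecessary, since passing the dict via params does the encoding for you.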

Upvotes: 1
