I'm looking for the correct sequence of Python decode() and encode() calls that will produce completely clean, parseable JSON, no matter which combination of unusual ASCII/Unicode/emoji characters is present.
In JavaScript, I'm trying to JSON.parse() some API data returned from my Python backend. My first hurdle was this entry:
🔥My name is Miguel Ive been in the flooring industry since 2014, on 2018 I started my own flooring store and our priority is provide the best customer service to our clients ! Visit us on our new showroom on woods cross we will be happy to help you with your project and we will give you a free in home consultation and ask about our 10%discount on same day booking ! 🔥
I successfully used this answer to decode such characters on the Python backend:
value.encode('cp1252','backslashreplace').decode('utf-8','backslashreplace')
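For context, here is a minimal sketch of what that chain does when the text really is UTF-8 that was mis-decoded as cp1252. The sample string is made up for illustration and is not API data:

```python
# Simulate mojibake: UTF-8 bytes wrongly decoded as cp1252.
original = "🔥 flooring"
garbled = original.encode("utf-8").decode("cp1252")

# Reversing the wrong step (text -> cp1252 bytes -> UTF-8 text) recovers the original.
repaired = garbled.encode("cp1252", "backslashreplace").decode("utf-8", "backslashreplace")

print(repr(garbled))
print(repr(repaired))
```

The round trip only works when every character of the input can be encoded as cp1252 and the resulting bytes form valid UTF-8. Anything else falls through to the backslashreplace handler.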
Result:
🔥My name is Miguel Ive been in the flooring industry since 2014, on 2018 I started my own flooring store and our priority is provide the best customer service to our clients ! Visit us on our new showroom on woods cross we will be happy to help you with your project and we will give you a free in home consultation and ask about our 10%discount on same day booking ! 🔥
But then I ran into new API calls whose JSON made JSON.parse() throw errors. For example, these \x92 characters were preventing the JSON from parsing correctly ("Bad escaped character"):
Hello, I\x92m Eric, I\x92ve been around all phases of construction since I was old enough to hold a hammer. Was taught by my father how to do it right the first time, and make it so people can afford to do more with their budget. I can do it all, but enjoy doing flooring and tile work. I\x92m not afraid of a creative challenge
So I reluctantly (since I knew it wouldn't be a reliable solution) added a simple replace() statement:
decoded_text.replace('\\x92', "’")
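As far as I can tell, the \x92 text may be produced by the chain itself rather than by the API. The sketch below assumes the API already returned a correct right single quote (U+2019); the sample sentence and the helper name is_valid_json are my own. A raw U+0092 control character in the input would yield the same literal \x92, so I can't be sure which case applies:

```python
import json

text = "I’m Eric"  # assumed: already-correct text containing U+2019

# cp1252 encodes U+2019 as the single byte 0x92. That byte is not valid UTF-8
# on its own, so the decode step writes the literal characters \x92 instead.
processed = text.encode("cp1252", "backslashreplace").decode("utf-8", "backslashreplace")


def is_valid_json(document: str) -> bool:
    """Return True if json.loads accepts the document."""
    try:
        json.loads(document)
        return True
    except json.JSONDecodeError:
        return False


raw_ok = is_valid_json('{"intro": "' + text + '"}')
processed_ok = is_valid_json('{"intro": "' + processed + '"}')

print(repr(processed), raw_ok, processed_ok)
```

JSON only allows a fixed set of backslash escapes (such as \n, \" and \uXXXX), so a literal \x92 inside a string makes the whole document unparseable.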
But, as expected, I soon ran into new API calls returning data that JSON.parse() refused to work with, like \u011f\x9f\x91\x8c\u011f\x9f\ufffd\xbd in the string below:
Went and above and beyond to expose and fix other poorly done installations \u011f\x9f\x91\x8c\u011f\x9f\ufffd\xbd
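One possible explanation for the \u011f prefix, offered as a guess: U+011F is "ğ", which is what byte 0xF0 becomes under cp1254 (Windows Turkish) rather than cp1252. The sketch below assumes the emoji's UTF-8 bytes were mis-decoded as cp1254 somewhere upstream, then shows what my cp1252 chain does to that text:

```python
emoji = "👌"

# Assumption: upstream decoded the emoji's UTF-8 bytes as cp1254, not cp1252.
garbled = emoji.encode("utf-8").decode("cp1254")

# cp1252 cannot encode 'ğ', so the encode step escapes it as \u011f; the
# remaining bytes are then invalid UTF-8 and get escaped as \xNN by the decode step.
via_cp1252 = garbled.encode("cp1252", "backslashreplace").decode("utf-8", "backslashreplace")

# Reversing with the matching codec recovers the emoji.
via_cp1254 = garbled.encode("cp1254").decode("utf-8")

print(repr(garbled), repr(via_cp1252), repr(via_cp1254))
```

If this guess is right, the \ufffd in the second sequence would mean one byte was already replaced with U+FFFD before the data reached me, in which case that character cannot be recovered by any encode/decode sequence.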
Here is my full Python code for calling the API, and my attempt at properly decoding it:
# models/api_connector.py
import base64
import configparser
import logging
import os

import requests
from odoo import models, fields, api

_logger = logging.getLogger(__name__)


class MyAPIConnector(models.Model):
    _name = 'api.connector'
    _description = 'API Connector'

    name = fields.Char('Name')

    def _generate_auth_header(self):
        # Read configuration
        config = configparser.ConfigParser()
        config_path = os.path.join(os.path.dirname(__file__), '..', 'config.ini')
        config.read(config_path)

        # Combine and encode credentials
        org_name = config.get('api_credentials', 'org_name')
        secret_key = config.get('api_credentials', 'secret_key')
        combined_credentials = f"{org_name}:{secret_key}"
        encoded_credentials = base64.b64encode(combined_credentials.encode()).decode()
        return f"Basic {encoded_credentials}"

    def fetch_data(self, zip_code):
        try:
            headers = {'Authorization': self._generate_auth_header()}
            category_pk = '123456789'
            API_ENDPOINT = f"https://api.myapi.com/v1/partners/discoverylite/pros?zip_code={zip_code}&category_pk={category_pk}&utm_source=myCompany"
            response = requests.get(API_ENDPOINT, headers=headers)
            response_text = response.text

            # Decoding and re-encoding the response text
            decoded_text = response_text.encode('cp1252', 'backslashreplace').decode('utf-8', 'backslashreplace')
            decoded_text = decoded_text.replace('\\x92', "’")
            return decoded_text
        except Exception as e:
            _logger.error(f"Error: {e}")
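One alternative I am considering, sketched under assumptions: parse the JSON first, repair each string value separately with strict codecs, and re-serialize with json.dumps so the output is always valid JSON. The names CANDIDATE_CODECS, repair_text, repair_tree and clean_json are hypothetical, and the codec list is a guess about where the mojibake comes from:

```python
import json

# Assumed legacy codecs that the UTF-8 bytes may have been mis-decoded as.
CANDIDATE_CODECS = ("cp1252", "cp1254")


def repair_text(value: str) -> str:
    """Undo UTF-8-read-as-legacy-codec mojibake; return value unchanged if no codec fits."""
    for codec in CANDIDATE_CODECS:
        try:
            # Strict error handling: a failed round trip never alters the text.
            return value.encode(codec).decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue
    return value


def repair_tree(node):
    """Apply repair_text to every string value in a parsed JSON structure."""
    if isinstance(node, str):
        return repair_text(node)
    if isinstance(node, list):
        return [repair_tree(item) for item in node]
    if isinstance(node, dict):
        return {key: repair_tree(val) for key, val in node.items()}
    return node


def clean_json(raw_json: str) -> str:
    """Parse, repair string values, and re-serialize as ASCII-only JSON."""
    data = json.loads(raw_json)
    # json.dumps escapes non-ASCII characters as \uXXXX by default.
    return json.dumps(repair_tree(data))
```

In fetch_data this would replace the encode/decode/replace lines with something like return clean_json(response.text). Strings that cannot be repaired, such as ones already containing U+FFFD, pass through unchanged instead of gaining invalid escapes. The round trip could in principle misfire on text that legitimately looks like mojibake, so I would treat it as a heuristic.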
Here is the raw, untouched data returned from the API for the latest problematic entry (which JSON.parse() parses just fine, but which contains encoded characters):
{
"results": [{
"service_id": "490803031657299972",
"business_name": "Jay Contractors LLC",
"rating": 4.972222222222222,
"num_reviews": 36,
"years_in_business": 1,
"num_hires": 87,
"thumbtack_url": "https://www.thumbtack.com/pa/lansdowne/handyman/jay-contractors-llc/service/490803031657299972?category_pk=151436204227846419\u0026utm_medium=partnership\u0026utm_source=cma-hewn\u0026zip_code=19701",
"image_url": "https://production-next-images-cdn.thumbtack.com/i/490934068026425349/desktop/standard/thumb",
"background_image_url": "https://production-next-images-cdn.thumbtack.com/i/319588452476190788/small/standard/hero",
"featured_review": "Went and above and beyond to expose and fix other poorly done installations 👌�",
"quote": {
"starting_cost": 5000,
"cost_unit": "on-site estimate"
},
"introduction": "My focus is quality, and dedication to the job. I take pride in every job, from start to finish. Always go for 100% satisfaction by client and see them smiling after the work is completed.\n\nPlease keep in mind if your using the instant book feature, it means your hiring me for my services and if you decide to cancel later on there will be a charge.\n\nAccept Cash, Cashapp, Zelle, and Venmo.\nNo Checks please.\n\nLicensed and Insured",
"pills": ["popular", "licensed"],
"location": "Bear, DE",
"similar_jobs_done": 1,
"license_verified": true,
"num_of_employees": 2,
"has_background_check": true,
"iframe_url": ""
}]
}
And here is that same entry's JSON after being processed by my Python decoding code, which JSON.parse() refuses to parse (with the error SyntaxError: Bad escaped character in JSON at ... JSON.parse (<anonymous>)):
{
"results": [{
"service_id": "490803031657299972",
"business_name": "Jay Contractors LLC",
"rating": 4.972222222222222,
"num_reviews": 36,
"years_in_business": 1,
"num_hires": 87,
"thumbtack_url": "https://www.thumbtack.com/pa/lansdowne/handyman/jay-contractors-llc/service/490803031657299972?category_pk=151436204227846419\u0026utm_medium=partnership\u0026utm_source=cma-hewn\u0026zip_code=19701",
"image_url": "https://production-next-images-cdn.thumbtack.com/i/490934068026425349/desktop/standard/thumb",
"background_image_url": "https://production-next-images-cdn.thumbtack.com/i/319588452476190788/small/standard/hero",
"featured_review": "Went and above and beyond to expose and fix other poorly done installations \u011f\x9f\x91\x8c\u011f\x9f\ufffd\xbd",
"quote": {
"starting_cost": 5000,
"cost_unit": "on-site estimate"
},
"introduction": "My focus is quality, and dedication to the job. I take pride in every job, from start to finish. Always go for 100% satisfaction by client and see them smiling after the work is completed.\n\nPlease keep in mind if your using the instant book feature, it means your hiring me for my services and if you decide to cancel later on there will be a charge.\n\nAccept Cash, Cashapp, Zelle, and Venmo.\nNo Checks please.\n\nLicensed and Insured",
"pills": ["popular", "licensed"],
"location": "Bear, DE",
"similar_jobs_done": 1,
"license_verified": true,
"num_of_employees": 2,
"has_background_check": true,
"iframe_url": ""
}]
}