Sparky

Reputation: 883

HTML Unescape is not unescaping special characters

My program does not unescape the HTML special characters for quotes and I can't figure out why. It still displays the special characters in the Terminal.

For example: 'In the comic book &quot;Archie&quot;

import requests
import html

API_URL = "https://opentdb.com/api.php"

parameters = {
    "amount": 10,
    "type": "boolean"
}

response = requests.get(API_URL, params=parameters)
data = html.unescape(response.json())
unescaped_data = data["results"]
print(f"UNESCAPED DATA: {unescaped_data}") # THIS IS NOT WORKING

Upvotes: 0

Views: 1954

Answers (2)

Jan Wilamowski

Reputation: 3599

The result isn't unescaped because response.json() returns the parsed JSON (i.e. a dict), not a string, and html.unescape() only operates on strings. You could unescape the raw response string with html.unescape(response.text), but that leaves you with invalid JSON, e.g.: "question":""Windows NT" is a monolithic kernel.", (note the additional quotes). The escaping is there for a reason, so unescape only the parts you really need, that is, the string values inside your JSON object.
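One way to unescape only the string values is to walk the parsed JSON recursively. The following is a minimal sketch, not a definitive implementation: the helper name unescape_strings and the sample dict are hypothetical, and the sample only imitates the shape of the response described in the question.

```python
import html


def unescape_strings(obj):
    """Return a copy of parsed JSON with HTML entities unescaped in every string value."""
    if isinstance(obj, str):
        return html.unescape(obj)
    if isinstance(obj, list):
        return [unescape_strings(item) for item in obj]
    if isinstance(obj, dict):
        return {key: unescape_strings(value) for key, value in obj.items()}
    return obj  # numbers, booleans and None are returned unchanged


# Hypothetical sample shaped like the "results" list from the question.
sample = {
    "results": [
        {"question": "&quot;Windows NT&quot; is a monolithic kernel.", "correct_answer": "False"}
    ]
}

unescaped = unescape_strings(sample)
print(unescaped["results"][0]["question"])
```

With the code from the question, this would be called as unescape_strings(response.json()), so the data stays a dict and data["results"] keeps working.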

Upvotes: 1

Sparky

Reputation: 883

It works when I change response.json() to response.text:

data = html.unescape(response.text)
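One caveat with this approach: data is now a plain string rather than a dict, so data["results"] no longer works, and the unescaped text may not parse as JSON any more. A small sketch of that effect, using a hypothetical raw string in place of the real response.text:

```python
import html
import json

# Hypothetical raw body standing in for response.text.
raw = '{"results": [{"question": "&quot;Windows NT&quot; is a monolithic kernel."}]}'

text = html.unescape(raw)  # still a str, now with bare quotes inside the value

try:
    json.loads(text)
    parsed_ok = True
except json.JSONDecodeError:
    # The unescaped quotes end the JSON string early, so parsing fails.
    parsed_ok = False

print(type(text).__name__, parsed_ok)
```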

Upvotes: 0
