Reputation: 5993
A quick web search will confirm that US ASCII is a subset of UTF-8, but what I've not yet found is how to convert &foo; and &#123; to their corresponding native UTF-8 characters.
I know that at least 7-bit US ASCII is unchanged in UTF-8, but I haven't yet seen a program that filters text and converts &foo; to the character it would naturally be written as in UTF-8.
Upvotes: 0
Views: 293
Reputation: 94
You can use html_entity_decode($s, ENT_QUOTES | ENT_HTML5, "UTF-8") in PHP (the second argument is the flags bitmask; the encoding is the third) or html.unescape(s) in Python.
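For the Python route, here is a minimal sketch of such a filter. The names decode_entities and filter_stream are made up for illustration; the only library call doing the work is html.unescape, which handles both named references like &eacute; and numeric ones like &#123;. It assumes the input is already decoded text (a str), so the UTF-8 part only happens when you write the result out.

```python
import html
import io


def decode_entities(text: str) -> str:
    """Replace HTML character references (named and numeric) with the
    characters they stand for. Plain ASCII text passes through unchanged."""
    return html.unescape(text)


def filter_stream(src, dst) -> None:
    """Copy text from src to dst line by line, decoding references on the way.

    src and dst are any text file-like objects (hypothetical helper, not
    part of the standard library).
    """
    for line in src:
        dst.write(decode_entities(line))


# Small in-memory demonstration using StringIO instead of real files.
demo_in = io.StringIO("caf&eacute; &amp; &#123;braces&#125;\n")
demo_out = io.StringIO()
filter_stream(demo_in, demo_out)
```

To run it as a command-line filter you could call filter_stream(sys.stdin, sys.stdout), or open the output file with encoding="utf-8" so the bytes written are UTF-8 regardless of the terminal's locale. Note that decoding is a single pass, so &amp;lt; becomes &lt; rather than <.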
Upvotes: 1