Ivo

Reputation: 23357

How do I unescape a multi-byte UTF-8 character?

I want to unescape "Sch%C3%B6ne". I found this unescape function online. It works in a lot of cases, but not in this one, because the single character is encoded as two bytes. I tested the following code on http://www.lua.org/cgi-bin/demo:

teststring = "Sch%C3%B6ne"

function unescape (str)
        str = string.gsub (str, "+", " ")
        str = string.gsub (str, "%%(%x%x)", function(h) return string.char(tonumber(h,16)) end)
        str = string.gsub (str, "\r\n", "\n")
        return str
end

print(unescape(teststring))

It prints SchÃ¶ne, but I want Schöne. Can anyone help me?

Upvotes: 3

Views: 264

Answers (1)

Yu Hao

Reputation: 122493

The method works fine: the function produces the correct UTF-8 bytes. It's the online Lua interpreter that doesn't display the result correctly in this UTF-8 example.

You can test it under another interpreter, e.g., this one.
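
One way to check this without relying on how a given page or terminal renders text is to dump the raw bytes of the result. The following is a minimal sketch that reuses the function from the question; the `decoded` and `hex` names are only for illustration. If the output contains C3 B6 where the ö should be, the string is valid UTF-8 and the garbled display most likely comes from the output being interpreted in a single-byte encoding.

```lua
-- Same unescape function as in the question
local function unescape(str)
  str = string.gsub(str, "+", " ")
  str = string.gsub(str, "%%(%x%x)", function(h)
    return string.char(tonumber(h, 16))
  end)
  str = string.gsub(str, "\r\n", "\n")
  return str
end

local decoded = unescape("Sch%C3%B6ne")

-- Collect every byte of the decoded string as two-digit hex
local hex = {}
for i = 1, #decoded do
  hex[#hex + 1] = string.format("%02X", string.byte(decoded, i))
end

print(table.concat(hex, " "))  -- 53 63 68 C3 B6 6E 65
print(#decoded)                -- 7 (bytes, not characters)
```

Note that `#decoded` counts bytes, so the six-character word reports a length of 7.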

Upvotes: 2
