Stéphane de Luca

Reputation: 13603

Swift: how to convert \U literal escapes into plain Unicode characters?

Say I have the following String, which originated from a server:

let uLiteralEncoded = "Derri\\U00e8re le transfert d'Anthony Martial"  // backslash doubled so the literal compiles

I'd like to convert it into the following String:

var plainEncoded = "Derrière le transfert d'Anthony Martial"

Upvotes: 4

Views: 2681

Answers (1)

Stéphane de Luca

Reputation: 13603

After further trials, I finally found a solution.

The string is actually HTML containing HTML entities (hence the apostrophe arriving as an entity and the diacritics showing up with \U coding).

I then wrote a String extension whose initializer builds a standard Swift 4 String from it:

#if canImport(UIKit)
import UIKit    // NSAttributedString's HTML importer comes from UIKit on iOS
#else
import AppKit   // ...and from AppKit on macOS
#endif

extension String {

    /// Creates a plain String from an HTML-encoded one, decoding its HTML entities.
    /// Apple documents that the HTML importer should not be called from a background thread.
    init(htmlEncodedString: String) {
        // Converting a Swift String to UTF-8 cannot fail, so no force unwrap is needed
        let encodedData = Data(htmlEncodedString.utf8)
        let attributedOptions: [NSAttributedString.DocumentReadingOptionKey: Any] = [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ]
        do {
            let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)

            self.init(attributedString.string)
        }
        catch {
            self.init(htmlEncodedString)    // Something went wrong; stick with the initial string
        }
    }
}
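
One caveat: as far as I know, the HTML importer decodes HTML entities only, not backslash escapes. If the server string really does contain a literal backslash followed by U and four hex digits, a separate pass is needed. The following is a minimal sketch under that assumption; the function name `decodingBackslashUEscapes` is hypothetical, and it only handles the four-digit form shown in the question.

```swift
import Foundation

/// Hypothetical helper: replaces each literal `\Uxxxx` (backslash, U or u,
/// four hex digits) with the Unicode scalar it names.
func decodingBackslashUEscapes(_ input: String) -> String {
    // Regex: a literal backslash, U or u, then exactly four hex digits.
    let regex = try! NSRegularExpression(pattern: "\\\\[Uu]([0-9A-Fa-f]{4})")
    let source = input as NSString
    var result = ""
    var cursor = 0

    let fullRange = NSRange(location: 0, length: source.length)
    for match in regex.matches(in: input, range: fullRange) {
        // Copy the untouched text that precedes this escape.
        let before = NSRange(location: cursor, length: match.range.location - cursor)
        result += source.substring(with: before)

        let hex = source.substring(with: match.range(at: 1))
        if let value = UInt32(hex, radix: 16), let scalar = Unicode.Scalar(value) {
            result.append(Character(scalar))
        } else {
            // Not a valid scalar on its own (e.g. a surrogate half): keep the escape as-is.
            result += source.substring(with: match.range)
        }
        cursor = match.range.location + match.range.length
    }

    // Copy whatever follows the last escape.
    result += source.substring(from: cursor)
    return result
}

let raw = "Derri\\U00e8re le transfert"
print(decodingBackslashUEscapes(raw))   // prints "Derrière le transfert"
```

The two steps can be combined: decode the backslash escapes first, then pass the result to `String(htmlEncodedString:)` to resolve any remaining HTML entities.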

Upvotes: 4
