user3064538

Replace all unpaired surrogates with the replacement character in a JavaScript string

I have a JavaScript string that I'm writing to a file. I need to replace any unpaired (lone) surrogates with the replacement character U+FFFD. Is there a regex character class that matches only unpaired surrogates, or do I have to do some additional processing?

Upvotes: 2

Views: 620

Answers (2)

user3064538

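function toWellFormed(s) {
  // With the `u` flag the regex matches whole code points, so a valid
  // surrogate pair counts as one character and only lone surrogates match.
  return s.replace(/\p{Surrogate}/gu, '\uFFFD')
}
toWellFormed('foo 𝌆')                  // 'foo 𝌆'
toWellFormed('foo \uD834\uDF06')       // 'foo 𝌆'
toWellFormed('foo \uD834')             // 'foo �'
toWellFormed('foo \uDF06\uDF06\uDF06') // 'foo ���'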

Upvotes: 1

String.prototype.toWellFormed() (added in ES2024) returns a new string in which any lone surrogates are replaced with the Unicode replacement character U+FFFD.
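For runtimes that may not ship the built-in yet, a minimal sketch could feature-detect it and fall back to a regex replacement. The helper name replaceLoneSurrogates is made up for illustration, and the fallback assumes the engine supports Unicode property escapes with the `u` flag:

```javascript
// Hypothetical helper (the name is an assumption, not a standard API).
function replaceLoneSurrogates(s) {
  // Prefer the built-in method when the runtime provides it.
  if (typeof String.prototype.toWellFormed === 'function') {
    return s.toWellFormed();
  }
  // Fallback: with the `u` flag a valid pair is matched as a single
  // code point, so \p{Surrogate} only matches unpaired surrogates.
  return s.replace(/\p{Surrogate}/gu, '\uFFFD');
}

const input = 'foo \uD834 bar \uD834\uDF06'; // one lone surrogate, one valid pair
const output = replaceLoneSurrogates(input);
console.log(output === 'foo \uFFFD bar \uD834\uDF06'); // true
```

Both branches should give the same result, so the check only decides whether the native implementation is used.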

Upvotes: 3
