Reputation: 37
I've been searching for an answer to this question for almost an hour now, so I thought I would finally ask. I know you can use \p{L} to match any kind of letter from any language, but I haven't encountered any way to match a letter from any language that is the equivalent of some English letter, like the letter a.
For example, ideally I'd want to match any equivalent of "a" or "A" in any language (like: Å, å, Ǻ, ǻ, Ḁ, ḁ, ẚ, Ă, etc...)
Upvotes: 2
Views: 56
Reputation: 8576
Your best shot is something like:
myStr.normalize("NFD").replace(/\p{Diacritic}/gu, "")
It uses normalize(), which needs ES6, and the \p{Diacritic} property escape with the u flag, which needs ES2018. For example:
myStr = "ÁàÀăĂắẮâÂåÅǻǺäÄãÃąĄāĀȃȂḁḀćčçÇéÉèÈêÊěëËḝęēĒȇȆíÍìÌîÎïÏīĪȋȊľńñÑóÓòÒôÔöÖõÕōŌȏȎŘȓȒśšŠşŞșȘţŢțȚúÚùÙûÛůŮüÜűũŨūŪȗȖẘýÝẙÿŸȳȲźžŽż"
myStr.normalize("NFD").replace(/\p{Diacritic}/gu, "")
// "AaAaAaAaAaAaAaAaAaAaAaAaAcccCeEeEeEeeEeeeEeEiIiIiIiIiIiIlnnNoOoOoOoOoOoOoORrRssSsSsStTtTuUuUuUuUuUuuUuUuUwyYyyYyYzzZz"
It even works for accented non-Latin letters like "ᾰᾸӑӐёӂӁйЙ" and for weirder things, like several letters sharing one accent: b͝g.
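Since the question asks about matching rather than replacing, the same idea can be turned into a test: strip the marks from each character and compare what is left with the base letter. This is only a minimal sketch. The helper names stripDiacritics, isVariantOf and findVariants are made up for illustration, and it assumes the input can be put in precomposed (NFC) form so that each accented letter is a single code point.

```javascript
// Hypothetical helper: reduce a string to its base letters.
function stripDiacritics(str) {
  return str.normalize("NFD").replace(/\p{Diacritic}/gu, "");
}

// Hypothetical helper: true when `ch` is a plain or accented form of `base`
// (case-insensitive).
function isVariantOf(ch, base) {
  return stripDiacritics(ch).toLowerCase() === base.toLowerCase();
}

// Hypothetical helper: collect every character of `text` that reduces to `base`.
// Assumption: NFC gives one code point per accented letter, which does not
// hold for every letter/mark combination.
function findVariants(text, base) {
  return Array.from(text.normalize("NFC")).filter((ch) => isVariantOf(ch, base));
}

console.log(isVariantOf("Å", "a")); // true
console.log(isVariantOf("é", "a")); // false
console.log(findVariants("Åb ḁ c", "a")); // [ 'Å', 'ḁ' ]
```

To search without splitting into characters, another option is to strip the marks from the whole string first and run an ordinary /a/gi regex on the result, at the cost of losing the original positions.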
Where does it not work? Well, on things like ẚ (LATIN SMALL LETTER A WITH RIGHT HALF RING).
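The reason? Its decomposition into a followed by U+02BE ʾ (MODIFIER LETTER RIGHT HALF RING) is a compatibility decomposition, which NFD does not apply, so the character is left as is. U+02BE is also outside the U+0300-U+036F combining marks block, where the accents that \p{Diacritic} removed in the example above live. (Check here for reference)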
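The difference between the two normalization forms is easy to see in a console. The sketch below also shows one possible workaround, which is an assumption and not part of the answer above: normalize with NFKD and additionally drop modifier letters (General_Category Lm), which is the category U+02BE belongs to. The name stripMarksLoose is hypothetical. Note that NFKD rewrites other compatibility characters too (ligatures, superscripts and so on), so this is more aggressive than the NFD version.

```javascript
const s = "\u1E9A"; // ẚ

// NFD applies canonical decompositions only, so ẚ comes back unchanged.
const nfd = s.normalize("NFD");
console.log(nfd === s); // true

// NFKD also applies compatibility decompositions: ẚ becomes "a" + U+02BE.
const nfkd = s.normalize("NFKD");
console.log(nfkd === "a\u02BE"); // true

// Hypothetical workaround: NFKD, then remove diacritics and modifier letters.
// Assumption: losing modifier letters and compatibility distinctions is acceptable.
function stripMarksLoose(str) {
  return str.normalize("NFKD").replace(/[\p{Diacritic}\p{Lm}]/gu, "");
}

console.log(stripMarksLoose(s)); // "a"
```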
Upvotes: 2