Santhucool

Reputation: 1706

Regular expression not working for at least one European character

I want to check whether my string contains at least one character from a European language (for example German, Spanish, or English).

I tried the following:

var check = "abc";

if (check.match(/^[a-zA-ZäöåÄÖÅ]+$/)) {
    alert("if");
} else {
    alert("else");
}

It should match only if the string has at least one European-language character, and it should not match if the string contains only numbers. Please guide me!

Upvotes: 2

Views: 6111

Answers (2)

joutsen

Reputation: 56

Just in case you're tolerant of unusual characters and want to make sure you don't miss any rare European-looking characters, here's a compact and deliberately loose regular expression:

[a-zA-ZÀ-ʸᴀ-ᶿḀ-ỿⅠ-ⅿⱠ-ⱿꜢ-ꟊꬰ-ꭥＡ-Ｚａ-ｚЀ-ӿͰ-Ͽἀ-῾\u0300-\u036f]

or

[a-zA-Z\u00c0-\u02b8\u1d00-\u1dbf\u1e00-\u1eff\u2160-\u217f\u2c60-\u2c7f\ua722-\ua7ca\uab30-\uab65\uff21-\uff3a\uff41-\uff5a\u0400-\u04ff\u0370-\u03ff\u1f00-\u1ffe\u0300-\u036f]

This includes Latin script, Cyrillic, Greek, Greek Extended, and Combining Diacritical Marks.

Tested against the List of Latin-script alphabets:

ÆⱭꞴÐƎƏƐƔIƖŊŒƆꞶƱKẞƩÞƲǷȜƷʔ æɑꞵðǝəɛɣıɩŋœɔꞷʊĸßʃþʋƿȝʒʔ ĄA̧Ą̊ƁƇÇĐƊƉĘȨƏ̧Ɛ̧ƑǤƓĦꞪĮI̧ƗƗ̧ƘŁM̧ƝǪO̧ØƠƆ̧ƤɌŞƬŢŦŲU̧ƯɄY̨Ƴ ąa̧ą̊ɓƈçđɗɖęȩə̧ɛ̧ƒǥɠħɦįi̧ɨɨ̧ƙłm̧ɲǫo̧øơɔ̧ƥɍşƭţŧųu̧ưʉy̨ƴ ÁÀȦÂÄǞǍĂĀÃÅǺǼǢḄĆĊĈČĎḌḐḒÉÈĖÊËĚĔĒẼE̊ẸǴĠĜǦĞG̃ĢĤḤ áàȧâäǟǎăāãåǻǽǣḅćċĉčďḍḑḓéèėêëěĕēẽe̊ẹǵġĝǧğg̃ģĥḥ ÍÌİÎÏǏĬĪĨỊĴĶǨĹĻĽĿḶḼM̂M̄NŃN̂ṄN̈ŇN̄ÑŅṊÓÒȮȰÔÖȪǑŎŌÕȬŐỌǾƠ íìiîïǐĭīĩịĵķǩĺļľŀḷḽm̂m̄ʼnńn̂ṅn̈ňn̄ñņṋóòȯȱôöȫǒŏōõȭőọǿơ P̄ŔŘŖṚŚŜṠŠȘṢŤȚṬṰÚÙÛÜǓŬŪŨŰŮỤẂẀŴẄẊÝỲŶŸȲỸŹŻŽẒǮ p̄ŕřŗṛśŝṡšșṣťțṭṱúùûüǔŭūũűůụẃẁŵẅẋýỳŷÿȳỹźżžẓǯ

I use it to check usernames on an academic forum to make sure people can write their native names as long as they are at least somewhat internationally readable.
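As a rough sketch of how such a username check might look, the snippet below reuses the \u-escaped class from above and anchors it so that every character has to fall inside it. The names `europeanLetter`, `usernamePattern`, and `isAcceptableUsername` are invented for this example, and the rule itself (letters and combining marks only, no digits or spaces) is an assumption rather than the forum's actual policy.

```javascript
// The \u-escaped character class from the answer, matching a single character.
const europeanLetter = /[a-zA-Z\u00c0-\u02b8\u1d00-\u1dbf\u1e00-\u1eff\u2160-\u217f\u2c60-\u2c7f\ua722-\ua7ca\uab30-\uab65\uff21-\uff3a\uff41-\uff5a\u0400-\u04ff\u0370-\u03ff\u1f00-\u1ffe\u0300-\u036f]/;

// Assumed rule: the whole name must consist of characters from the class.
const usernamePattern = new RegExp("^" + europeanLetter.source + "+$");

// Hypothetical helper, for illustration only.
function isAcceptableUsername(name) {
  return usernamePattern.test(name);
}

console.log(isAcceptableUsername("Stribi\u017cew")); // true (ż is U+017C)
console.log(isAcceptableUsername("\u0418\u0432\u0430\u043d")); // true (Cyrillic "Иван")
console.log(isAcceptableUsername("e\u0301")); // true (e + combining acute accent)
console.log(isAcceptableUsername("user123")); // false (digits are outside the class)
console.log(isAcceptableUsername("\u5f20\u4f1f")); // false (CJK characters)
```

To only require at least one such character somewhere in the string, `europeanLetter.test(name)` on its own would be enough.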

Upvotes: 0

Wiktor Stribiżew

Reputation: 626845

You just need to remove the anchors and the quantifier and use test:

alert(/(?![×÷])[A-Za-zÀ-ÿ]/.test("ß1111")) // true: ß is a letter
alert(/(?![×÷])[A-Za-zÀ-ÿ]/.test("ö"))     // true
alert(/(?![×÷])[A-Za-zÀ-ÿ]/.test("12345")) // false: digits only

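The (?![×÷])[A-Za-zÀ-ÿ] regex is an adaptation of the regex provided in Useful ASCII Ranges. It matches the ASCII letters plus the accented letters in the À-ÿ range (U+00C0 to U+00FF); the lookahead excludes × and ÷, which also fall inside that range. Letters outside that range, such as the Polish ł, are not matched.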
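To make the difference from the question's pattern concrete, here is a minimal sketch comparing the anchored version with the unanchored one. The helper names `isOnlyLetters` and `hasLetter` are made up for this illustration.

```javascript
// Anchored with a quantifier: every character must be in the class.
const onlyLetters = /^[a-zA-ZäöåÄÖÅ]+$/;

// Unanchored, no quantifier: one matching character anywhere is enough.
// The lookahead (?![×÷]) rejects the two non-letters inside À-ÿ.
const anyLetter = /(?![×÷])[A-Za-zÀ-ÿ]/;

// Hypothetical helpers, for illustration only.
function isOnlyLetters(text) {
  return onlyLetters.test(text);
}

function hasLetter(text) {
  return anyLetter.test(text);
}

console.log(isOnlyLetters("abc1")); // false: the digit breaks the full-string match
console.log(hasLetter("abc1"));     // true: "a" is enough
console.log(hasLetter("12345"));    // false: no letter at all
console.log(hasLetter("\u00d7\u00f7")); // false: × and ÷ are excluded
console.log(hasLetter("\u0142"));   // false: ł (U+0142) lies outside À-ÿ
```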

Some more language-related character ranges you can use:

French Letters: [a-zA-ZàâäôéèëêïîçùûüÿæœÀÂÄÔÉÈËÊÏÎŸÇÙÛÜÆŒ]

German Letters: [a-zA-ZäöüßÄÖÜ]

Polish Letters only: [a-pr-uwy-zA-PR-UWY-ZąćęłńóśźżĄĆĘŁŃÓŚŹŻ] (Note that there is no Q, V and X in Polish, but if you want to allow all English letters as well, use [a-zA-ZąćęłńóśźżĄĆĘŁŃÓŚŹŻ])

Italian Letters: [a-zA-ZàèéìíîòóùúÀÈÉÌÍÎÒÓÙÚ]

Spanish Letters: [a-zA-ZáéíñóúüÁÉÍÑÓÚÜ]

And some more...

Swedish: [a-zA-ZäöåÄÖÅ]

Norwegian: [a-zA-ZæøåÆØÅ]

Danish (same as Norwegian): [a-zA-ZæøåÆØÅ]

Greek & Coptic + Greek Extended: [\u0370-\u03FF\u1F00-\u1FFF]

Russian: [а-яА-ЯёЁ]

Ukrainian: [а-щА-ЩЬьЮюЯяЇїІіЄєҐґ]

Serbian (Cyrillic): [А-ИК-ШЂЈ-ЋЏа-ик-шђј-ћџ]

Bulgarian (subset of Russian alphabet): [а-ъьюяА-ЪЬЮЯ]

Belarusian script range: [ёа-зй-шы-яЁА-ЗЙ-ШЫ-ЯІіЎў]

Romanian: [a-zA-ZĂÂÎȘȚăâîșț]
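If you need to check against several of these alphabets, one possible arrangement is a lookup table of per-language classes. This is only a sketch: the object `letterClasses`, its keys, and the helpers `hasLetterOf` and `hasLetterOfAny` are names I made up, and only a handful of the classes listed above are included.

```javascript
// A few of the per-language classes from the list; the keys are hypothetical labels.
const letterClasses = {
  german: /[a-zA-ZäöüßÄÖÜ]/,
  polish: /[a-zA-ZąćęłńóśźżĄĆĘŁŃÓŚŹŻ]/,
  spanish: /[a-zA-ZáéíñóúüÁÉÍÑÓÚÜ]/,
  russian: /[а-яА-ЯёЁ]/,
  greek: /[\u0370-\u03FF\u1F00-\u1FFF]/,
};

// True if the text contains at least one letter of the given language.
function hasLetterOf(language, text) {
  const pattern = letterClasses[language];
  if (!pattern) {
    throw new Error("Unknown language: " + language);
  }
  return pattern.test(text);
}

// True if the text contains at least one letter from any listed language.
function hasLetterOfAny(text) {
  return Object.values(letterClasses).some((pattern) => pattern.test(text));
}

console.log(hasLetterOf("german", "123\u00df")); // true: ß
console.log(hasLetterOf("russian", "abc"));      // false: Latin letters only
console.log(hasLetterOf("greek", "\u03bb"));     // true: λ
console.log(hasLetterOfAny("12345"));            // false: digits only
```

None of the patterns use the g flag, so calling test() repeatedly on the same regex object is safe here.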

Upvotes: 18
