Roberts Sensters

Reputation: 11

JavaScript Regex - Non-Latin Characters and Whitespace Between Them

I need a regex to validate the Firstname and Lastname fields.

People can have two names, for example, so it should be able to handle multiple non-Latin words.

([^\x00-\x7F]|\w)+

This is what I have for validating Latin + non-Latin characters, but it doesn't support multiple words.

If I enter JĀNIS BĀNIS, for example, it doesn't work!
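
To make the failure concrete, here is a minimal check. It assumes the pattern is applied to the whole field (anchored with ^ and $, as an HTML pattern attribute or a typical validation would do); the variable name is only for illustration.

```javascript
// Assumption: the pattern must match the entire input, so it is anchored here.
const original = /^([^\x00-\x7F]|\w)+$/;

// A single word passes: every character is either non-ASCII or a \w character.
console.log(original.test("JĀNIS"));       // true

// Two words fail: the space is ASCII (so [^\x00-\x7F] rejects it) and is not \w.
console.log(original.test("JĀNIS BĀNIS")); // false
```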

Upvotes: 1

Views: 582

Answers (1)

user12097764

Querying the UCD (Unicode Character Database) for Latin-specific properties that are letters and numbers,
using this regex ( \w ) for Latin:

[\p{Block=Basic_Latin}\p{Block=Latin_1_Supplement}\p{Block=Latin_Extended_A}\p{Block=Latin_Extended_Additional}\p{Block=Latin_Extended_B}\p{Block=Latin_Extended_C}\p{Block=Latin_Extended_D}\p{Block=Latin_Extended_E}\p{Script=Latin}\p{Script_Extensions=Latin}](?<=\w)

yields this JavaScript-usable class:

[0-9A-Z_a-z\u00AA\u00B5\u00BA\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02B8\u02E0-\u02E4\u0363-\u036F\u0485-\u0486\u0951-\u0952\u1D00-\u1D25\u1D2C-\u1D5C\u1D62-\u1D65\u1D6B-\u1D77\u1D79-\u1DBE\u1E00-\u1EFF\u2071\u207F\u2090-\u209C\u20F0\u212A-\u212B\u2132\u214E\u2183-\u2184\u2C60-\u2C7F\uA722-\uA788\uA78B-\uA7BF\uA7C2-\uA7C6\uA7F7-\uA7FF\uAB30-\uAB5A\uAB5C-\uAB67\uFB00-\uFB06\uFF21-\uFF3A\uFF41-\uFF5A]
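
For comparison, on runtimes that support ES2018 Unicode property escapes (the u flag), a rough approximation of this class can be written without enumerating ranges. This is only a sketch and not character-for-character identical to the class above: JavaScript has no Block= property, so only the Script_Extensions part of the query is expressible, and the name latinWordChar is made up for the example.

```javascript
// Assumption: the runtime supports Unicode property escapes (ES2018, "u" flag).
// Approximation only: characters whose Script_Extensions include Latin,
// plus ASCII digits and underscore.
const latinWordChar = /^[\p{Script_Extensions=Latin}0-9_]$/u;

// Print which sample characters the approximate class accepts.
const samples = ["J", "Ā", "7", "_", "Я", "中", " "];
for (const ch of samples) {
  console.log(JSON.stringify(ch), latinWordChar.test(ch));
}
```

The explicit ranges remain the safer choice where older engines have to be supported.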

___________________

Doing the same for punctuation ( \p{P} ) for Latin:

[\p{Block=Basic_Latin}\p{Block=Latin_1_Supplement}\p{Block=Latin_Extended_A}\p{Block=Latin_Extended_Additional}\p{Block=Latin_Extended_B}\p{Block=Latin_Extended_C}\p{Block=Latin_Extended_D}\p{Block=Latin_Extended_E}\p{Script=Latin}\p{Script_Extensions=Latin}](?<=\p{P})

yields this JavaScript-usable class:

[!-#%-*,-/:-;?-@\[-\]_{}\u00A1\u00A7\u00AB\u00B6-\u00B7\u00BB\u00BF\u10FB\uA92E]

______________

Both can be combined with the whitespace class \s to get a reasonable
name validation regex.

/^(?:[\s!-#%-*,-\/:-;?-@\[-\]_{}\u00A1\u00A7\u00AB\u00B6-\u00B7\u00BB\u00BF\u10FB\uA92E]*[0-9A-Z_a-z\u00AA\u00B5\u00BA\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02B8\u02E0-\u02E4\u0363-\u036F\u0485-\u0486\u0951-\u0952\u1D00-\u1D25\u1D2C-\u1D5C\u1D62-\u1D65\u1D6B-\u1D77\u1D79-\u1DBE\u1E00-\u1EFF\u2071\u207F\u2090-\u209C\u20F0\u212A-\u212B\u2132\u214E\u2183-\u2184\u2C60-\u2C7F\uA722-\uA788\uA78B-\uA7BF\uA7C2-\uA7C6\uA7F7-\uA7FF\uAB30-\uAB5A\uAB5C-\uAB67\uFB00-\uFB06\uFF21-\uFF3A\uFF41-\uFF5A]+)+[\s!-#%-*,-\/:-;?-@\[-\]_{}\u00A1\u00A7\u00AB\u00B6-\u00B7\u00BB\u00BF\u10FB\uA92E]*$/

Expanded

^
(?:
   [\s!-#%-*,-/:-;?-@\[-\]_{}\u00A1\u00A7\u00AB\u00B6-\u00B7\u00BB\u00BF\u10FB\uA92E]*
   [0-9A-Z_a-z\u00AA\u00B5\u00BA\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02B8\u02E0-\u02E4\u0363-\u036F\u0485-\u0486\u0951-\u0952\u1D00-\u1D25\u1D2C-\u1D5C\u1D62-\u1D65\u1D6B-\u1D77\u1D79-\u1DBE\u1E00-\u1EFF\u2071\u207F\u2090-\u209C\u20F0\u212A-\u212B\u2132\u214E\u2183-\u2184\u2C60-\u2C7F\uA722-\uA788\uA78B-\uA7BF\uA7C2-\uA7C6\uA7F7-\uA7FF\uAB30-\uAB5A\uAB5C-\uAB67\uFB00-\uFB06\uFF21-\uFF3A\uFF41-\uFF5A]+
)+
[\s!-#%-*,-/:-;?-@\[-\]_{}\u00A1\u00A7\u00AB\u00B6-\u00B7\u00BB\u00BF\u10FB\uA92E]*
$
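
As a usage sketch, the same pattern can be assembled from its two classes so each part stays readable. The names LATIN_WORD, SEPARATOR, NAME_RE and isValidName are hypothetical, chosen for the example. The brackets in the punctuation range are escaped (\[-\]) so that the character class does not end early.

```javascript
// Latin letters, digits and underscore: the word-character class from above.
const LATIN_WORD = String.raw`0-9A-Z_a-z\u00AA\u00B5\u00BA\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02B8\u02E0-\u02E4\u0363-\u036F\u0485-\u0486\u0951-\u0952\u1D00-\u1D25\u1D2C-\u1D5C\u1D62-\u1D65\u1D6B-\u1D77\u1D79-\u1DBE\u1E00-\u1EFF\u2071\u207F\u2090-\u209C\u20F0\u212A-\u212B\u2132\u214E\u2183-\u2184\u2C60-\u2C7F\uA722-\uA788\uA78B-\uA7BF\uA7C2-\uA7C6\uA7F7-\uA7FF\uAB30-\uAB5A\uAB5C-\uAB67\uFB00-\uFB06\uFF21-\uFF3A\uFF41-\uFF5A`;

// Whitespace plus Latin punctuation. "/" needs no escape inside new RegExp().
const SEPARATOR = String.raw`\s!-#%-*,-/:-;?-@\[-\]_{}\u00A1\u00A7\u00AB\u00B6-\u00B7\u00BB\u00BF\u10FB\uA92E`;

// Optional separators, then at least one run of word characters, repeated,
// followed by optional trailing separators.
const NAME_RE = new RegExp(
  `^(?:[${SEPARATOR}]*[${LATIN_WORD}]+)+[${SEPARATOR}]*$`
);

// Returns true when the whole string passes the name pattern.
function isValidName(name) {
  return NAME_RE.test(name);
}

console.log(isValidName("JĀNIS BĀNIS"));       // true
console.log(isValidName("Anne-Marie O'Neil")); // true
console.log(isValidName("Иван"));              // false (Cyrillic, not Latin)
console.log(isValidName(""));                  // false (no word characters)
```

Because the group is a quantified run inside another quantifier, long inputs that fail near the end can backtrack heavily, so it may be worth capping the field length before testing.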

Upvotes: 1
