user1386320

How to check Bosnian-specific characters in RegEx?

I have this Regular Expression pattern, which is quite simple and it validates if the provided string is "alpha" (both uppercase and lowercase):

var pattern = /^[a-zA-Z]+$/gi;

When I call pattern.test('Zlatan Omerovic') it returns true; however, if I call:

pattern.test('Zlatan Omerović');

It returns false and it fails my validation.

In the Bosnian language we have these specific characters:

š đ č ć ž

And uppercased:

Š Đ Č Ć Ž

Is it possible to validate these characters (both cases) with a JavaScript regular expression?

Upvotes: 5

Views: 3274

Answers (3)

T.J. Crowder

Reputation: 1074989

a-zA-Z means exactly that, in an English-centric way: the 26 letters abcdefghijklmnopqrstuvwxyz, in both cases. JavaScript's regular expressions have no locale-sensitive "alpha" definition, so to match other alphabetic characters you have to list them explicitly.

You can do that literally (for instance, by including š in the character class) or with Unicode escape sequences (such as \u0161). Where the additional Bosnian characters occupy a contiguous range of code points, you can use the - notation for them as well, but that range has to be separate from a-z, which is defined in English terms.
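As a minimal sketch of the two spellings described above (the variable names and sample strings are illustrative assumptions, not part of the original answer), the literal and escaped forms describe the same character class:

```javascript
// Illustrative names only; nothing here comes from a library.
// Literal form: the source file must be saved and served as UTF-8.
const literalPattern = /^[a-zA-ZšđčćžŠĐČĆŽ]+$/;

// Escaped form: the same ten characters written as \uXXXX escapes.
const escapedPattern =
  /^[a-zA-Z\u0161\u0111\u010D\u0107\u017E\u0160\u0110\u010C\u0106\u017D]+$/;

// Compare both patterns on a few sample strings.
const samples = ['Omerović', 'ŠĐČĆŽ', 'Omerovic', 'Omerović1'];
for (const sample of samples) {
  console.log(sample, literalPattern.test(sample), escapedPattern.test(sample));
}
```

Both patterns should agree on every sample: the first three consist only of listed letters, while the last one contains a digit and is rejected.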

Upvotes: 2

ryanbrill

Reputation: 2011

Sure, you can just add those characters to the list of characters you're matching. Also, since you're doing a case-insensitive match (the i flag), you don't need the uppercase characters.

var pattern = /^[a-zšđčćž ]+$/i;

Fiddle here: http://jsfiddle.net/ryanbrill/KB74b/

Here's an alternate pattern that uses Unicode escape sequences, which might be more robust (embedding the characters literally won't work if the file isn't saved with the proper encoding, for instance):

var pattern = /^[a-z\u0161\u0111\u010D\u0107\u017E ]+$/i;

http://jsfiddle.net/ryanbrill/KB74b/2/
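For reuse across several inputs, a rough sketch along these lines may help. The helper name isFullName and the sample names are assumptions made for illustration. The sketch deliberately leaves out the g flag, because test() on a global regex remembers lastIndex between calls and can alternate between true and false for the same input:

```javascript
// Hypothetical helper; the name isFullName is not from the original answer.
// Case-insensitive (i) so only the lowercase letters need to be listed.
const namePattern = /^[a-z\u0161\u0111\u010D\u0107\u017E ]+$/i;

function isFullName(text) {
  return namePattern.test(text);
}

console.log(isFullName('Zlatan Omerović')); // true
console.log(isFullName('ŽELJKO ŠARIĆ'));    // true
console.log(isFullName('Zlatan 0merović')); // false (contains the digit 0)

// The pitfall with g: the same call gives different answers in a row.
const globalPattern = /^[a-z ]+$/gi;
const firstCall = globalPattern.test('abc');  // true, lastIndex is now 3
const secondCall = globalPattern.test('abc'); // false, search resumed at index 3
console.log(firstCall, secondCall);
```

Note that this class also accepts a string made only of spaces; trimming the input or requiring a letter first would be a separate design choice.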

Upvotes: 9

HopeNick

Reputation: 244

To make the test accept the first of your five characters (the S-based one, in both cases), I used a range:

var pattern = /^[a-zA-Z\u0160-\u0161]+$/;

You can add the other characters you need the same way ;)
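Extending that idea to all five letters could look like the sketch below. The variable name is an assumption for illustration. It relies on the fact that each of these letters sits directly next to its other case in Unicode (Ć/ć at U+0106 and U+0107, Č/č at U+010C and U+010D, Đ/đ at U+0110 and U+0111, Š/š at U+0160 and U+0161, Ž/ž at U+017D and U+017E), so every pair fits in a two-character range:

```javascript
// Illustrative name; one two-character range per upper/lowercase pair.
const rangePattern =
  /^[a-zA-Z\u0106-\u0107\u010C-\u010D\u0110-\u0111\u0160-\u0161\u017D-\u017E]+$/;

console.log(rangePattern.test('Omerović'));        // true
console.log(rangePattern.test('ŠĐČĆŽšđčćž'));      // true
console.log(rangePattern.test('Zlatan Omerović')); // false (space is not in the class)
```

The five pairs are not adjacent to each other, so a single wide range such as \u0106-\u017E would also let through many letters that are not part of the Bosnian alphabet.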

Upvotes: 1
