erimerturk

Reputation: 4288

Regex for all alphabets

I need a regex that works for all alphabets. I have an input and a target text, and each of them can be written in a different alphabet: Chinese, Latin, Cyrillic, or any other.

I need a regex for multi-language input and multi-language target text.

Does anybody have an idea how to write this regex?

I will use this with JavaScript, but I think there should be a common regex that works in both Java and JavaScript for this problem.

Upvotes: 1

Views: 1955

Answers (3)

erimerturk

Reputation: 4288

I use "|" as a separator, so it is a special character for me. A key can contain any character except "|". This solved my problem; thanks for the answers. It can be used with JavaScript, Java, and Groovy. I tested it and it worked.

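// Any run of characters in U+0000-U+FFEF except "|" (U+007C), the separator.
var keyPrefix = "\\|[\u0000-\u007B\u007D-\uFFEF]*";
var keySuffix = "[\u0000-\u007B\u007D-\uFFEF]*\\|";
// Note: key is inserted as-is, so regex metacharacters in it are not escaped.
var searchkey = keyPrefix + key.toLowerCase() + keySuffix;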
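
The snippet above concatenates the key straight into the pattern, so a key such as "a.b" would be read as regex syntax. A minimal sketch of one way around that, assuming the same "|"-delimited target text: escapeRegExp and buildSearchPattern are hypothetical helper names, and the negated class [^|] is swapped in here as a shorter stand-in for the explicit Unicode ranges, not the author's original class.

```javascript
// Hypothetical helper: escape regex metacharacters in the user-supplied key.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: match one "|"-delimited entry that contains the key.
function buildSearchPattern(key) {
  var notPipe = "[^|]*"; // any run of characters other than the separator
  return new RegExp(
    "\\|" + notPipe + escapeRegExp(key.toLowerCase()) + notPipe + "\\|"
  );
}

// Assumed sample data: entries in several scripts, separated by "|".
var target = "|привет мир|hello world|你好世界|";
var match = buildSearchPattern("мир").exec(target);
console.log(match[0]); // "|привет мир|"
```

As in the original, the key is lowercased before matching, so the target text is assumed to be lowercase as well.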

Upvotes: 0

stema

Reputation: 92976

If you are in Java (not in JavaScript!), you can use Unicode properties, e.g.

\p{L} any kind of letter from any language.

See regular-expressions.info/unicode for more information.

For JavaScript:

The XRegExp library and its Unicode plugins extend JavaScript's regex features, adding support for Unicode categories, scripts, and blocks.

With those libs you would be able to use \p{L} in JavaScript.

See my answer to this question for a small example.
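
JavaScript engines that implement ES2018 Unicode property escapes also accept \p{L} natively when the regex carries the "u" flag, so XRegExp is not required there. A minimal sketch under that assumption; extractWords is a hypothetical helper name and the sample string is made up for illustration.

```javascript
// Hypothetical helper: collect every run of letters, whatever the script.
function extractWords(text) {
  // \p{L} = any Unicode letter; it is only recognized with the "u" flag.
  return text.match(/\p{L}+/gu) || [];
}

console.log(extractWords("hello, мир! 你好 123"));
// [ 'hello', 'мир', '你好' ]
```

Digits and punctuation are skipped because they are not in the letter category.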

Upvotes: 4

Kirill Polishchuk

Reputation: 56162

Some regex engines support a special escape that matches any Unicode letter:

\p{L}

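Or you can use \w, which matches a letter, digit, or underscore. Be aware that in JavaScript, and in Java by default, \w covers only ASCII characters.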
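
A quick way to compare the two options in JavaScript, where \w is limited to ASCII letters, digits, and underscore. This is only a sketch: the variable names are hypothetical, and the \p{L} line assumes an engine with ES2018 Unicode property escapes.

```javascript
var asciiWord = /^\w+$/;       // ASCII letters, digits, underscore only
var anyLetters = /^\p{L}+$/u;  // letters from any script (needs the "u" flag)

console.log(asciiWord.test("hello_42"));  // true
console.log(asciiWord.test("привет"));    // false
console.log(anyLetters.test("привет"));   // true
console.log(anyLetters.test("hello_42")); // false: digits and "_" are not letters
```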

Upvotes: 2
