ian

Reputation: 61

Regex for Hebrew, English, Symbols

As part of a small program I'm writing, I need to filter a string input that might be "gibberish" (any character in UTF-8). The input can be Hebrew and/or English, but it may also contain all the normal signs such as ?%$!@'_ and so on.

A friend suggested using a regex, but since I'm inexperienced with it, I've come to you for advice.

How can I create a C# function that checks an input text and returns false if it's not "right"?

My attempt so far is:

public static bool shortTest(string input)
{
    string pattern = @"^[אבגדהוזחטיכלמנסעפצקרשתץףןםa-zA-Z0-9\_]+$";
    Regex regex = new Regex(pattern);
    return regex.IsMatch(input);
}

All the characters between "[" and "a" are Hebrew letters.

Upvotes: 6

Views: 5831

Answers (2)

oCcSking

Reputation: 928

For Hebrew letters, in C# you can do something like this:

return System.Text.RegularExpressions.Regex.IsMatch(value, @"^[א-ת]+$");
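As a hedged sketch of how this range behaves: assuming the range א-ת corresponds to U+05D0 through U+05EA (which includes the final forms ך ם ן ף ץ), the pattern accepts only Hebrew letters, so spaces, digits, and English text are rejected. The class and method names below are hypothetical and only serve the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HebrewOnlyDemo
{
    // Hypothetical helper: true only when every character is a Hebrew letter
    // in the range U+05D0 (alef) to U+05EA (tav).
    public static bool IsHebrewLettersOnly(string value)
    {
        return Regex.IsMatch(value, @"^[\u05D0-\u05EA]+$");
    }

    public static void Main()
    {
        Console.WriteLine(IsHebrewLettersOnly("שלום"));       // Hebrew letters only: True
        Console.WriteLine(IsHebrewLettersOnly("שלום world")); // space and English: False
    }
}
```

On its own this does not cover the English letters and signs the question asks about; those would have to be added to the character class.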

enjoy =)

Upvotes: 9

Casimir et Hippolyte

Reputation: 89566

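You can use the \p{IsHebrew} character class instead of enumerating all Hebrew characters, \w for word characters such as [a-zA-Z0-9_] (note that in .NET \w is Unicode-aware by default, so it also matches letters from other scripts unless RegexOptions.ECMAScript is set), and \s for spaces, tabs, and newlines. You can also add dots, commas, and so on. An example: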

^[\p{IsHebrew}\w\s,.?!;:-]+$

or

^[\p{IsHebrew}\w\s\p{P}]+$

\p{P} stands for all punctuation signs (as far as I know: .,?!:;-_(){}[]\/'"&#@%*)
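Putting the pieces together, a minimal sketch of the validator the question asks for might look like the following. It assumes that "right" means Hebrew, basic Latin letters and digits, whitespace, punctuation, and symbols. \p{S} is added as an assumption, because characters such as $ and + are classified as symbols rather than punctuation and so are not covered by \p{P}. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class InputValidator
{
    // Assumed whitelist: Hebrew block, ASCII letters and digits, whitespace,
    // punctuation (\p{P}) and symbols (\p{S}, which covers $, +, ^ and similar).
    private static readonly Regex Allowed =
        new Regex(@"^[\p{IsHebrew}a-zA-Z0-9\s\p{P}\p{S}]+$");

    public static bool IsValid(string input)
    {
        // Null or empty input is treated as invalid in this sketch.
        return !string.IsNullOrEmpty(input) && Allowed.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("שלום, hello! 100%$")); // Hebrew, English, signs: True
        Console.WriteLine(IsValid("hello привет"));       // Cyrillic is not whitelisted: False
    }
}
```

The sketch spells out a-zA-Z0-9 instead of using \w so that letters from scripts other than Hebrew and English are rejected. \p{P} and \p{S} are broad categories, so a stricter filter could list the accepted signs explicitly instead.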

Upvotes: 3
