Stephan

Reputation: 43053

POSIX character equivalents in Java regular expressions

I would like to use a regular expression like this in Java: [[=a=][=e=][=i=]].

But Java doesn't support the POSIX character equivalence classes [=a=], [=e=], etc.

How can I do this? More precisely, is there a way to match characters beyond US-ASCII?

Upvotes: 8

Views: 9671

Answers (3)

Johan Sjöberg

Reputation: 49227

Java does support POSIX character classes; the syntax is just different. For instance:

\p{Lower}
\p{Upper}
\p{ASCII}
\p{Alpha}
\p{Digit}
\p{Alnum}
\p{Punct}
\p{Graph}
\p{Print}
\p{Blank}
\p{Cntrl}
\p{XDigit}
\p{Space}
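
As a rough illustration of how these classes are written in Java source, here is a minimal sketch. The class name PosixClassDemo and its helper methods are made up for this example. Note the doubled backslash needed inside a Java string literal.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class PosixClassDemo {
    // One or more lower-case letters (ASCII only by default).
    static final Pattern LOWER_WORD = Pattern.compile("\\p{Lower}+");
    // One or more hexadecimal digits.
    static final Pattern HEX = Pattern.compile("\\p{XDigit}+");

    static boolean isLowerWord(String s) {
        return LOWER_WORD.matcher(s).matches();
    }

    static boolean isHex(String s) {
        return HEX.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLowerWord("hello")); // true
        System.out.println(isLowerWord("Hello")); // false, 'H' is upper-case
        System.out.println(isHex("CAFE42"));      // true
    }
}
```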

Upvotes: 15

ahmet alp balkan

Reputation: 45302

Quoting from http://download.oracle.com/javase/1.6.0/docs/api/java/util/regex/Pattern.html

POSIX character classes (US-ASCII only)

\p{Lower}   A lower-case alphabetic character: [a-z]
\p{Upper}   An upper-case alphabetic character: [A-Z]
\p{ASCII}   All ASCII: [\x00-\x7F]
\p{Alpha}   An alphabetic character: [\p{Lower}\p{Upper}]
\p{Digit}   A decimal digit: [0-9]
\p{Alnum}   An alphanumeric character: [\p{Alpha}\p{Digit}]
\p{Punct}   Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
\p{Graph}   A visible character: [\p{Alnum}\p{Punct}]
\p{Print}   A printable character: [\p{Graph}\x20]
\p{Blank}   A space or a tab: [ \t]
\p{Cntrl}   A control character: [\x00-\x1F\x7F]
\p{XDigit}  A hexadecimal digit: [0-9a-fA-F]
\p{Space}   A whitespace character: [ \t\n\x0B\f\r]
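
To make the "US-ASCII only" restriction concrete, the sketch below checks a single accented letter three ways. It goes beyond the quoted 1.6 documentation in one respect: the Pattern.UNICODE_CHARACTER_CLASS flag only exists from Java 7 onward, so that part assumes a newer runtime. The Unicode category \p{L} is the alternative that also works on older versions. The class and method names are invented for the example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class UnicodeClassDemo {
    // U+00E9, LATIN SMALL LETTER E WITH ACUTE, written as an escape
    // so the source file encoding does not matter.
    static final String E_ACUTE = "\u00e9";

    // Default behaviour: \p{Lower} means [a-z] only.
    static boolean asciiLower(String s) {
        return Pattern.compile("\\p{Lower}").matcher(s).matches();
    }

    // Java 7+ (assumption about the runtime): the flag makes the POSIX
    // classes follow Unicode properties instead of ASCII ranges.
    static boolean unicodeLower(String s) {
        return Pattern.compile("\\p{Lower}", Pattern.UNICODE_CHARACTER_CLASS)
                      .matcher(s).matches();
    }

    // Unicode general category "Letter", independent of the flag.
    static boolean unicodeLetter(String s) {
        return Pattern.compile("\\p{L}").matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(asciiLower(E_ACUTE));    // false
        System.out.println(unicodeLower(E_ACUTE));  // true
        System.out.println(unicodeLetter(E_ACUTE)); // true
    }
}
```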

Upvotes: 6

Amir Raminfar

Reputation: 34179

Copied from here

Java does not support POSIX bracket expressions, but does support POSIX character classes using the \p operator. Though the \p syntax is borrowed from the syntax for Unicode properties, the POSIX classes in Java only match ASCII characters as indicated below. The class names are case-sensitive. Unlike the POSIX syntax, which can only be used inside a bracket expression, Java's \p can be used both inside and outside bracket expressions.
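
A minimal sketch of both points follows. IDENT uses \p both outside and inside a bracket expression. Since [=a=] itself is unavailable, stripMarks shows one possible approximation, which is not a built-in equivalent: decompose the input with java.text.Normalizer, drop the combining marks, then match plain letters. This is an assumption-laden workaround. It is not locale-aware the way POSIX equivalence classes are, and letters that have no canonical decomposition (for example ø) are left untouched. All names here are invented for the example.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical sketch; class and method names are illustrative only.
class EquivalenceSketch {
    // \p{Alpha} used outside brackets, \p{Alnum} used inside a bracket expression.
    static final Pattern IDENT = Pattern.compile("\\p{Alpha}[\\p{Alnum}_]*");

    // Unicode combining marks (general category M).
    static final Pattern MARKS = Pattern.compile("\\p{M}+");

    // Rough stand-in for [[=a=][=e=][=i=]] applied to the first character.
    static final Pattern STARTS_AEI = Pattern.compile("^[aei]");

    // Decompose accented letters (NFD), then remove the accents.
    static String stripMarks(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return MARKS.matcher(decomposed).replaceAll("");
    }

    static boolean startsWithAEI(String s) {
        return STARTS_AEI.matcher(stripMarks(s)).find();
    }

    public static void main(String[] args) {
        System.out.println(IDENT.matcher("x_1").matches());  // true
        System.out.println(stripMarks("\u00e9t\u00e9"));     // ete
        System.out.println(startsWithAEI("\u00e9t\u00e9"));  // true
    }
}
```

If the accented characters must be preserved in the matched text, run the pattern on the stripped copy only to decide whether there is a match, and keep the original string for output.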

Upvotes: 2
