Animesh Pandey

Reputation: 6018

regex matching yelling words and emojis in a string

I have the following regex for matching phrases in which all letters are uppercase:

private static String ALL_CAPS_REGEXP = "\\b[A-Z\\s]+\\b";

but this does not match strings like ;D, :P, :O etc.

A few examples are:

that I want to match. The idea is to ignore any character that is not a letter, while requiring every letter to be uppercase.

Assume that the letters used in any emoji are uppercase only.

What change should I make to the regex so that it matches emoji-like strings that contain uppercase characters?

Upvotes: 0

Views: 652

Answers (1)

Andreas

Reputation: 159135

For a regex "to match emoji like strings that have uppercase characters", we first need to define what "emoji like strings" means.

Since emojis use (combinations of) various punctuation marks, and you limit to emojis using uppercase letters, you could declare that any combination of punctuation marks and optional uppercase letters is an emoji.

In that case, just list the punctuation marks in the character class, escaping the hyphen and the square brackets so they are treated as literals:

"\\b[A-Z\\s!@#$%^&*()_\\-+={}\\[\\]:;\"'<>?,./]+\\b"

Or, perhaps more descriptively, using POSIX character classes:

"\\b[\\p{Upper}\\p{Space}\\p{Punct}]+\\b"

Potentially prefixed by (?U) for full Unicode/international support.
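To see how the POSIX-class pattern behaves with Matcher.find, here is a minimal sketch. The class name YellingMatcher and the helper findAll are hypothetical names introduced only for illustration; the pattern string is the one given above, without the (?U) prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
class YellingMatcher {
    // Uppercase letters, whitespace, and ASCII punctuation between word boundaries.
    static final Pattern YELLING =
            Pattern.compile("\\b[\\p{Upper}\\p{Space}\\p{Punct}]+\\b");

    // Collects every non-overlapping match found in the input, in order.
    static List<String> findAll(String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = YELLING.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll("WOW :D")); // prints [WOW :D]
        System.out.println(findAll(":P"));     // prints [P]
    }
}
```

Note that \b needs a word character on one side, so an emoticon at the very start of the input loses its leading punctuation, as the second call shows. Whether that matters depends on how the matches are used.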

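You would probably also want to filter out single-character matches, otherwise a lone capital such as the I in "I rock." can be reported as a match, so use {2,} instead of +. Keep in mind that whitespace counts toward that length, because \p{Space} is part of the character class.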

"(?U)\\b[\\p{Upper}\\p{Space}\\p{Punct}]{2,}\\b"

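The effect of swapping + for {2,} can be checked with a small comparison. This is only a sketch: QuantifierComparison, ONE_OR_MORE, TWO_OR_MORE, and findAll are hypothetical names, and the sample inputs are chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison harness; the class and method names are illustrative.
class QuantifierComparison {
    // Same character class, differing only in the quantifier.
    static final Pattern ONE_OR_MORE =
            Pattern.compile("(?U)\\b[\\p{Upper}\\p{Space}\\p{Punct}]+\\b");
    static final Pattern TWO_OR_MORE =
            Pattern.compile("(?U)\\b[\\p{Upper}\\p{Space}\\p{Punct}]{2,}\\b");

    // Collects every non-overlapping match of the pattern in the input.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findAll(ONE_OR_MORE, "(I)"));    // prints [I]
        System.out.println(findAll(TWO_OR_MORE, "(I)"));    // prints []
        System.out.println(findAll(TWO_OR_MORE, "WOW :D")); // prints [WOW :D]
    }
}
```

Because \p{Space} is in the class, a capital followed by a space (for example the "I " at the start of "I rock.") is already two characters long and still matches under {2,}. Trimming each match, or removing \p{Space} from the class, are possible adjustments if that is not wanted.
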
Upvotes: 1
