Neret

Reputation: 185

Match all words in a text with more than one uppercase character using regex

How can I select all words in a text that contain more than one uppercase character? I managed to select a specific word with this pattern:

(?<![a-z])word(?![a-z])

But I'm not sure how to select words like SElect, SeLeCt, SelecT, seleCT, selEcT.

Upvotes: 2

Views: 194

Answers (3)

Ryszard Czech

Reputation: 18631

Let me also suggest a fully Unicode-aware regex:

/(?<!\p{L})(?:\p{Ll}*\p{Lu}){2}\p{L}*(?!\p{L})/gu

See proof.

Explanation:

--------------------------------------------------------------------------------
  (?<!                     look behind to see if there is not:
--------------------------------------------------------------------------------
    \p{L}                  any Unicode letter
--------------------------------------------------------------------------------
  )                        end of look-behind
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (2 times):
--------------------------------------------------------------------------------
    \p{Ll}*                any lowercase Unicode letter (0 or more
                           times, matching as many as possible)
--------------------------------------------------------------------------------
    \p{Lu}                 any uppercase Unicode letter
--------------------------------------------------------------------------------
  ){2}                     end of grouping
--------------------------------------------------------------------------------
  \p{L}*                   any Unicode letter (0 or more
                           times, matching as many as possible)
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    \p{L}                  any Unicode letter
--------------------------------------------------------------------------------
  )                        end of look-ahead

JavaScript:

const regex = /(?<!\p{L})(?:\p{Ll}*\p{Lu}){2}\p{L}*(?!\p{L})/gu;
const string = "SEEEEect, SeLeCt, SelecT, seleCT, selEcT select, seleCT, selEcT select, donselect";
console.log(string.match(regex));
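
To see what the Unicode property escapes buy you, here is a small sketch that wraps the same pattern in a helper and runs it on words with non-ASCII letters. The helper name `findMultiUpperWords` and the sample text are made up for illustration. It assumes an engine that supports lookbehind and `\p{…}` with the `u` flag, such as a recent Node.js or browser.

```javascript
// Hypothetical helper, not part of the original answer.
// Returns every word that contains at least two uppercase letters.
function findMultiUpperWords(text) {
  // Same pattern as above: no letter before, two "lowercase* + uppercase"
  // units, any remaining letters, no letter after.
  const regex = /(?<!\p{L})(?:\p{Ll}*\p{Lu}){2}\p{L}*(?!\p{L})/gu;
  // match() returns null when nothing matches, so fall back to [].
  return text.match(regex) ?? [];
}

// Sample text is an assumption, chosen to include Latin-1 and Cyrillic letters.
const sample = "ØRsted ørsted МоСква Москва select";
console.log(findMultiUpperWords(sample)); // [ 'ØRsted', 'МоСква' ]
```

Words with a single uppercase letter ("Москва") or none ("ørsted", "select") are skipped, regardless of script.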

Upvotes: 0

The fourth bird

Reputation: 163517

You could use a pattern that asserts there is no letter directly to the left or right of the "word", and matches two uppercase characters, each preceded by optional lowercase characters, followed by any remaining letters of either case:

(?<![a-zA-Z])[a-z]*[A-Z][a-z]*[A-Z][A-Za-z]*(?![a-zA-Z])

Explanation

  • (?<![a-zA-Z]) Assert not a-zA-Z at the left
  • [a-z]*[A-Z] Match optional chars a-z followed by A-Z to match the first uppercase char
  • [a-z]*[A-Z] Match again optional chars a-z followed by A-Z to match the second uppercase char
  • [A-Za-z]* Match any remaining chars A-Za-z (the rest of the word)
  • (?![a-zA-Z]) Assert not a-zA-Z at the right

Regex demo
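
A quick way to try the pattern in JavaScript, as a minimal sketch: the variable names and the two extra non-matching words ("Select", "select") are my own additions for illustration, and the engine is assumed to support lookbehind.

```javascript
// ASCII-only pattern from the explanation above.
const asciiRegex = /(?<![a-zA-Z])[a-z]*[A-Z][a-z]*[A-Z][A-Za-z]*(?![a-zA-Z])/g;

// Sample words from the question, plus two that should not match.
const text = "SElect, SeLeCt, SelecT, seleCT, selEcT, Select, select";

// match() returns null when nothing matches, so fall back to [].
const words = text.match(asciiRegex) ?? [];
console.log(words); // [ 'SElect', 'SeLeCt', 'SelecT', 'seleCT', 'selEcT' ]
```

Because the character classes are ASCII-only, accented or non-Latin letters are treated as non-letters here.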

Upvotes: 3

Robert

Reputation: 2763

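// Repeat "lowercase* + uppercase" or "uppercase + lowercase*" at least twice,
// ending at a word boundary (\b). The left side is not anchored.
const regex = /([a-z]*[A-Z]|[A-Z][a-z]*){2,}\b/g
const str = "SEEEEect, SeLeCt, SelecT, seleCT, selEcT select, seleCT, selEcT select, donselect"

const match = str.match(regex)
console.log(match)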
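
One edge case worth checking is a word that begins with lowercase letters and also ends with them. The sketch below compares the posted pattern with a hypothetical variation (not from the answer) that puts `\b` on both sides and allows trailing lowercase letters after the last uppercase one. The input string is made up for the test.

```javascript
// Pattern as posted: only the right-hand side is anchored with \b.
const posted = /([a-z]*[A-Z]|[A-Z][a-z]*){2,}\b/g;
// Hypothetical variation: \b on both sides, trailing lowercase allowed.
const anchored = /\b(?:[a-z]*[A-Z]){2,}[a-z]*\b/g;

// Made-up input: "aBcDe" starts and ends with lowercase letters.
const input = "aBcDe SeLeCt select";

console.log(input.match(posted));   // [ 'BcDe', 'SeLeCt' ]  (first match starts mid-word)
console.log(input.match(anchored)); // [ 'aBcDe', 'SeLeCt' ] (whole words only)
```

Keep in mind that `\b` treats digits and underscores as word characters, so both patterns behave differently from the lookaround-based answers when words touch digits or `_`.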

Upvotes: 0
