Commata

Reputation: 195

RegEx to find any uppercase word followed by a colon

I need a RegEx to match an uppercase string ending with a colon. The string can contain spaces, numbers and periods. For example, given:

mystring = "I have a C. GRAY CAT2:"

I want the ColdFusion expression

REFind("[A-Z0-9. ][:]",mystring) 

to return the number 9, matching "C. GRAY CAT2:". Instead, it is returning the number 21, matching only the colon. I hope that a correction of the regex will solve the problem. Of course I have tried many, many things. Thank you!

Upvotes: 2

Views: 2536

Answers (2)

Commata

Reputation: 195

I have revised the accepted answer to my own question to cover the German special characters:

[A-Z][A-Z0-9.ÜÄÖß ]*:

This appears to work. However, German orthography has recently gained a capital ß (ẞ), which the pattern above does not cover. It is surely not on most keyboards yet, so it should not be a problem for the RegEx for a while.
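A quick way to check how the extended pattern behaves on German input is sketched below. It uses Python's re module rather than ColdFusion, on the assumption that both engines treat this simple pattern the same way, and the sample strings are made up for illustration. Note that the first character class is still plain [A-Z], so a word that begins with an umlaut is matched only from its second letter on.

```python
import re

# Pattern from the post above; the first class is still plain [A-Z].
pattern = re.compile(r"[A-Z][A-Z0-9.ÜÄÖß ]*:")

def first_match(text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group() if m else None

# Hypothetical sample strings, not taken from the original post.
umlaut_inside = first_match("die GRÖSSE:")    # umlaut after the first letter
eszett_inside = first_match("die STRAßE:")    # lowercase ß inside the word
umlaut_first = first_match("ÜBUNG:")          # umlaut as the first letter
capital_eszett = first_match("STRA\u1e9eE:")  # capital ẞ (U+1E9E) inside the word

print(umlaut_inside, eszett_inside, umlaut_first, capital_eszett)
```

If words starting with Ü, Ä or Ö need to match in full, those letters would also have to be added to the first class.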

Upvotes: 0

Wiktor Stribiżew

Reputation: 627100

I suggest using

[A-Z0-9][A-Z0-9. ]*:

See the regex demo

Details

  • [A-Z0-9] - an uppercase letter or digit (in case the first char can be a digit, else remove 0-9)
  • [A-Z0-9. ]* - zero or more uppercase letters/digits, . or space
  • : - a colon.
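To see why the original attempt stops just before the colon, here is a minimal sketch that runs both patterns against the sample string. It uses Python's re module as a stand-in for ColdFusion's REFind, assuming the two engines agree on these simple patterns. Python positions are 0-based, so 1 is added to mimic REFind's 1-based result.

```python
import re

mystring = "I have a C. GRAY CAT2:"

# Original attempt: exactly ONE character from the class, then a colon.
original = re.search(r"[A-Z0-9. ][:]", mystring)

# Suggested pattern: an uppercase letter or digit, then any run of
# uppercase letters, digits, dots or spaces, then a colon.
suggested = re.search(r"[A-Z0-9][A-Z0-9. ]*:", mystring)

# +1 converts Python's 0-based index to a 1-based position.
original_pos = original.start() + 1
suggested_pos = suggested.start() + 1

print(original.group(), original_pos)    # 2: 21
print(suggested.group(), suggested_pos)  # C. GRAY CAT2: 10
```

The original pattern has no quantifier on its character class, so it can only consume the single character before the colon, which is why the reported position is 21. With the suggested pattern the match starts at the C, position 10 when counting from 1; the space before it at position 9 is not part of the match.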

Variations

To avoid matching substrings like 345: while still allowing ones like 23 VAL:, use

\b(?=[0-9. ]*[A-Z])[A-Z0-9][A-Z0-9. ]*:

See this regex demo. Here, \b matches a word boundary first, and then the positive lookahead (?=[0-9. ]*[A-Z]) makes sure there is an uppercase letter after zero or more digits, spaces or dots.
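The effect of the lookahead can be checked with a short sketch like the one below. It again uses Python's re module as a stand-in for ColdFusion, assuming lookaheads and \b behave the same in both, and the first two test sentences are made up for illustration.

```python
import re

# Variation with a word boundary and a lookahead that requires at least
# one uppercase letter after any leading digits, dots or spaces.
pattern = re.compile(r"\b(?=[0-9. ]*[A-Z])[A-Z0-9][A-Z0-9. ]*:")

def first_match(text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group() if m else None

digits_only = first_match("total 345: done")         # digits only: rejected
digits_then_word = first_match("see 23 VAL: here")   # digits, then a word
original_sample = first_match("I have a C. GRAY CAT2:")

print(digits_only, digits_then_word, original_sample)
```

The digits-only case fails because the lookahead never finds an uppercase letter before the colon, while 23 VAL: passes because V follows the digits and the space.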

If you do not expect numbers at the start of the sequence, i.e. you need to extract C. GRAY CAT2 out of I have a 22 C. GRAY CAT2:, use Sebastian's suggestion (demo).

Upvotes: 2
