JamesW

Reputation: 41

Regular expression to find a string that always contains 6 characters, capitals and numbers only

I've been trying to match a 6-character string like ABC123 (or any combination of capital letters and digits) using a regular expression. I can match ABCDE1, 1ABCDE, or even AC34FG. As long as the string contains at least one capital letter and one digit, the regular expression works just fine. But something like ABCDEF or 123456 does not match! What am I missing? The regular expression I use is:

(?<=\t)([0-9]+[A-Z]+|[A-Z]+[0-9]+)[0-9A-Z]*(?=\t)

Any help would be appreciated! Thanks!

Upvotes: 1

Views: 1037

Answers (2)

Wiktor Stribiżew

Reputation: 627327

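Your (?<=\t)([0-9]+[A-Z]+|[A-Z]+[0-9]+)[0-9A-Z]*(?=\t) pattern explicitly requires at least one digit followed by at least one letter (with [0-9]+[A-Z]+), or at least one letter followed by at least one digit (with [A-Z]+[0-9]+), in between tab chars. A string made of only letters or only digits satisfies neither alternative, so it never matches.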
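To see this behaviour concretely, here is a small sketch using Python's re module (the helper name matches_original is made up for illustration, and the tabs are added around each test value by hand). It also shows that the original pattern never checks the length:

```python
import re

# The pattern from the question, unchanged.
original = re.compile(r"(?<=\t)([0-9]+[A-Z]+|[A-Z]+[0-9]+)[0-9A-Z]*(?=\t)")

def matches_original(field: str) -> bool:
    # Hypothetical helper: wrap the field in tabs, as the lookarounds expect.
    return original.search("\t" + field + "\t") is not None

mixed = matches_original("ABC123")         # letters followed by digits
letters_only = matches_original("ABCDEF")  # neither alternative can match
digits_only = matches_original("123456")   # neither alternative can match
too_short = matches_original("A1")         # matches: length is never checked

print(mixed, letters_only, digits_only, too_short)
```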

To just match any 6 char substring in between tabs that consists of uppercase ASCII letters or digits, you may use

(?<=\t)[A-Z0-9]{6}(?=\t)

See this regex demo.
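As a quick check, a minimal sketch in Python (the sample line is invented for illustration) shows which tab-separated fields this pattern picks up:

```python
import re

# Exactly six uppercase ASCII letters or digits, with a tab on each side.
six_between_tabs = re.compile(r"(?<=\t)[A-Z0-9]{6}(?=\t)")

# Made-up sample: fields of various lengths and cases, separated by tabs.
line = "id\tABCDEF\t123456\tAC34FG\tABC12\tABCDEFG\tabc123\tend"

# The lookarounds do not consume the tabs, so adjacent fields can share one.
found = six_between_tabs.findall(line)
print(found)
```

Fields that are too short, too long, or lowercase are skipped, while letters-only and digits-only fields are now accepted.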

Or, to also match at the start/end of string:

(?<![^\t])[A-Z0-9]{6}(?![^\t])

See another regex demo.
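The difference between the two patterns shows up when the first or last field qualifies. A small sketch in Python, with a made-up sample line:

```python
import re

# Requires a real tab on both sides.
strict = re.compile(r"(?<=\t)[A-Z0-9]{6}(?=\t)")
# "Not preceded/followed by a non-tab": also true at start/end of string.
relaxed = re.compile(r"(?<![^\t])[A-Z0-9]{6}(?![^\t])")

# Made-up sample: the first and last fields have no tab on their outer side.
line = "ABC123\tXYZ\t123456\tABCDEF"

strict_found = strict.findall(line)
relaxed_found = relaxed.findall(line)
print(strict_found, relaxed_found)
```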

Upvotes: 1

DesertEagle

Reputation: 599

If I understand you correctly, your approach is way too complicated.

/\b[A-Z0-9]{6}\b/

This catches any string of exactly 6 characters made up of capitals, digits, or both. Note that the \b parts are word boundaries; you could change these delimiters to whatever fits your needs.
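A short sketch in Python (the sample text is invented) shows what the word boundaries do and do not treat as delimiters:

```python
import re

six_word = re.compile(r"\b[A-Z0-9]{6}\b")

# Made-up sample text with different delimiters around the candidates.
text = "ABC123, ABCDEF; ABCDEFG AB-123456 X_ABC123"

found = six_word.findall(text)
print(found)
```

The 7-character ABCDEFG is rejected, but the hyphen in AB-123456 counts as a boundary, so 123456 is matched on its own. The underscore is a word character, so ABC123 inside X_ABC123 is not matched. With tab-separated data, the tab-based lookarounds may therefore be the safer choice.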

Another word of warning: A-Z covers only the 26 ASCII uppercase letters, so umlauts or accented characters will not be caught here. Use something like \p{Lu} (any uppercase letter; \p{L} matches letters of either case) if your engine supports it and your data requires it. See https://www.regular-expressions.info/unicode.html for more details.
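Not every engine has Unicode property classes: Python's built-in re module, for example, does not accept \p{...}. The sketch below therefore uses str.isupper() as a stand-in for an uppercase-letter class. The helper name is_code is hypothetical, and it assumes accented letters are stored as single precomposed characters:

```python
import re

ascii_only = re.compile(r"[A-Z0-9]{6}")

def is_code(field: str) -> bool:
    # Hypothetical helper: exactly 6 characters, each an ASCII digit
    # or a character Python considers uppercase (Unicode-aware).
    return len(field) == 6 and all(
        ch in "0123456789" or ch.isupper() for ch in field
    )

sample = "ÄBC123"  # made-up value with an umlaut
ascii_hit = ascii_only.fullmatch(sample) is not None
unicode_hit = is_code(sample)
print(ascii_hit, unicode_hit)
```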

Upvotes: 0
