Reputation: 95
I have created this regex pattern: (?<=[TCC|TCC_BHPB]\s\d{3,4})[-_\s]\d{1,2}[,]
This pattern only matches:
TCC 6005_5,
What should I change at the end so that it matches both of these strings?
TCC 6005-5 ,
TCC 6005_5,
Upvotes: 0
Views: 103
Reputation: 163207
This part of the pattern, [TCC|TCC_BHPB], is a character class that matches a single one of the listed characters. It could equally be written as, for example, [|_TCBHP].
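To make the character-class point concrete, here is a minimal sketch using Python's built-in re module (the question does not say which regex engine is in use, so Python is only an assumption for illustration). It checks that both spellings of the class accept the same single characters, and that a non-capturing alternation is what matches the literal words.

```python
import re

# Both classes list the same set of characters: T, C, B, H, P, | and _
char_class = re.compile(r"[TCC|TCC_BHPB]")
equivalent = re.compile(r"[|_TCBHP]")

# Every listed character matches on its own, in either spelling
single_chars_ok = all(
    char_class.fullmatch(ch) and equivalent.fullmatch(ch) for ch in "TCBHP|_"
)

# A character class consumes exactly one character, so the word does not match
word_via_class = char_class.fullmatch("TCC")

# An alternation inside a non-capturing group matches the literal words
alternation = re.compile(r"(?:TCC|TCC_BHPB)")
word_via_group = alternation.fullmatch("TCC_BHPB")

print(single_chars_ok)         # True
print(word_via_class)          # None
print(word_via_group.group())  # TCC_BHPB
```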
To "match" both strings, you can match all parts instead of using a positive lookbehind.
\bTCC(?:_BHPB)?\s\d{3,4}[-_\s]\d{1,2}\s?,
\bTCC
A word boundary to prevent a partial match, then match TCC
(?:_BHPB)?\s\d{3,4}
Optionally match _BHPB, then match a whitespace char and 3-4 digits (use [0-9] if you want to match only the digits 0-9)
[-_\s]\d{1,2}
Match one of -, _ or a whitespace char, then 1-2 digits
\s?,
Match an optional whitespace char and ,
Note that \s can also match a newline.
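As a quick check of the full-match pattern above, here is a small sketch in Python's re module (an assumed engine, chosen only for illustration). The sample strings other than the two from the question are made up for this test.

```python
import re

pattern = re.compile(r"\bTCC(?:_BHPB)?\s\d{3,4}[-_\s]\d{1,2}\s?,")

samples = [
    "TCC 6005-5 ,",     # from the question
    "TCC 6005_5,",      # from the question
    "TCC_BHPB 123 7,",  # hypothetical: optional _BHPB, whitespace as separator
    "XTCC 6005_5,",     # hypothetical: no word boundary before TCC
    "TCC 60_5,",        # hypothetical: only 2 digits where 3-4 are required
]

results = {s: bool(pattern.search(s)) for s in samples}
for text, matched in results.items():
    print(repr(text), matched)

# \s also accepts a newline, so this string matches as well
newline_match = bool(pattern.search("TCC\n6005_5,"))
print(newline_match)  # True
```

The first three samples match and the last two do not.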
Using the lookbehind:
(?<=TCC(?:_BHPB)?\s\d{3,4})[-_\s]\d{1,2}\s?,
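The lookbehind version needs an engine that allows variable-width lookbehind. Python's built-in re module does not, so the sketch below first shows that the pattern is rejected there, and then swaps in a capturing group to get the same suffix. This is a different technique from the lookbehind, shown only as a fallback for engines with that restriction.

```python
import re

lookbehind = r"(?<=TCC(?:_BHPB)?\s\d{3,4})[-_\s]\d{1,2}\s?,"

# Python's re only accepts fixed-width lookbehind, so compiling this fails
try:
    re.compile(lookbehind)
    lookbehind_supported = True
except re.error:
    lookbehind_supported = False

# Fallback: match the prefix normally and capture only the suffix in group 1
with_group = re.compile(r"\bTCC(?:_BHPB)?\s\d{3,4}([-_\s]\d{1,2}\s?,)")
text = "TCC 6005-5 , and TCC 6005_5,"
suffixes = [m.group(1) for m in with_group.finditer(text)]

print(lookbehind_supported)  # False
print(suffixes)              # ['-5 ,', '_5,']
```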
Or, if the whitespace should be spaces or tabs but never a newline (with any number of them allowed before the comma):
\bTCC(?:_BHPB)?[\p{Zs}\t][0-9]{3,4}[-_\p{Zs}\t][0-9]{1,2}[\p{Zs}\t]*,
Upvotes: 0
Reputation: 38727
You can add a non-greedy wildcard (.*?) to your expression:
(?<=(?:TCC|TCC_BHPB)\s\d{3,4})[-_\s]\d{1,2}.*?[,]
                                           ^^^
This will now also match any characters between the last digit and the comma.
As has been pointed out in the comments, [TCC|TCC_BHPB] is a character class rather than a literal match, so I've changed this to (?:TCC|TCC_BHPB), which is presumably what you intended.
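To see what "any characters" means in practice, here is a rough comparison in Python's re module (an assumed engine). Because that module rejects variable-width lookbehind, the prefix is matched normally and the suffix is captured in a group instead; the strict variant with \s? and the third sample string are additions for comparison, not part of the pattern above.

```python
import re

# Same prefix for both; only the part between the digits and the comma differs
lazy = re.compile(r"(?:TCC|TCC_BHPB)\s\d{3,4}([-_\s]\d{1,2}.*?,)")
strict = re.compile(r"(?:TCC|TCC_BHPB)\s\d{3,4}([-_\s]\d{1,2}\s?,)")

def suffix(regex, text):
    """Return the captured suffix, or None when there is no match."""
    m = regex.search(text)
    return m.group(1) if m else None

# Both strings from the question match with the lazy wildcard
print(suffix(lazy, "TCC 6005-5 ,"))  # -5 ,
print(suffix(lazy, "TCC 6005_5,"))   # _5,

# Hypothetical input: the wildcard also swallows arbitrary text before the comma
noisy = "TCC 6005-5 extra text, more,"
print(suffix(lazy, noisy))    # -5 extra text,
print(suffix(strict, noisy))  # None
```

If only optional whitespace should be allowed before the comma, the stricter \s? form is the safer choice.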
Upvotes: 1