Cesar Lopes

Reputation: 413

Repeated regex pattern

I need to write a regex to match some String formats. I'm not very good at regex, which is why I'm asking for your help.

I receive a set of Strings that all start with the same format:

Examples: "1ASF1-1-A42-A_4-214A-GarbageText", "DI-21f-112rf-A-214_124_12412A_GarbageText", "312c_12412_1241-12rf-001-GarbageText"

An alphanumeric run of 1 to 20 characters followed by - or _, repeated any number of times (I can't know in advance how many repetitions there will be). After that comes some garbage text.

How can I write a regex that checks whether a String starts with the pattern I want? I think it would be something like:

[a-zA-Z0-9]{1,20}[_-]+

Upvotes: 0

Views: 55

Answers (2)

The fourth bird

Reputation: 163632

If you want to match the whole string, you can start with a run of 1-20 alphanumeric characters, and then optionally repeat - or _ followed by another run of 1-20 characters.

[a-zA-Z0-9]{1,20}(?:[-_][a-zA-Z0-9]{1,20})*
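
As a rough illustration of how this first pattern behaves, here is a minimal Python sketch (Python is an assumption; the question does not name a language, and the helper name `is_whole_match` is made up for this example). It uses `re.fullmatch` so the entire string has to fit the pattern. Note that the sample strings match in full, because "GarbageText" is itself an alphanumeric run of fewer than 20 characters.

```python
import re

# Pattern from the answer: alphanumeric runs of 1-20 characters,
# joined by exactly one - or _ between runs.
SEGMENTS = re.compile(r"[a-zA-Z0-9]{1,20}(?:[-_][a-zA-Z0-9]{1,20})*")


def is_whole_match(text):
    """Return True if the entire string fits the pattern (hypothetical helper)."""
    return SEGMENTS.fullmatch(text) is not None


samples = [
    "1ASF1-1-A42-A_4-214A-GarbageText",
    "DI-21f-112rf-A-214_124_12412A_GarbageText",
    "312c_12412_1241-12rf-001-GarbageText",
]

for sample in samples:
    print(sample, is_whole_match(sample))
```

Strings with consecutive separators (such as "AB--CD"), a trailing separator, or a run longer than 20 characters are rejected by this check.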


If you want to match up to the last occurrence of - or _, assuming there are no consecutive dashes or underscores such as --:

[a-zA-Z0-9]{1,20}(?:[-_][a-zA-Z0-9]{1,20})*[_-]
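
A small Python sketch of how this variant could be used to separate the structured prefix from the trailing text (Python and the helper name `split_prefix` are assumptions for illustration, not part of the original answer). `re.match` anchors at the start of the string only, so whatever follows the last separator is left over.

```python
import re

# Pattern from the answer: same runs as before, but the match must end
# on a - or _ separator.
PREFIX = re.compile(r"[a-zA-Z0-9]{1,20}(?:[-_][a-zA-Z0-9]{1,20})*[_-]")


def split_prefix(text):
    """Return (prefix, remainder), or None if the string does not start
    with the pattern (hypothetical helper)."""
    m = PREFIX.match(text)
    if m is None:
        return None
    return m.group(), text[m.end():]


print(split_prefix("1ASF1-1-A42-A_4-214A-GarbageText"))
print(split_prefix("NoSeparators"))
```

Because the garbage text in the examples is itself alphanumeric, the regex cannot tell it apart from a valid run; it simply treats everything after the last separator as the remainder. With consecutive separators the match stops at the first one.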


If the dashes and underscores can also be consecutive or mixed (for example -- or -_):

[a-zA-Z0-9](?:[-_a-zA-Z0-9]*[-_])
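
For comparison, a minimal Python sketch of this looser variant (again, Python and the helper name `mixed_prefix` are assumptions). It accepts consecutive or mixed separators, but it no longer enforces the 1-20 character limit per run, since the character class inside the group is repeated without a bound.

```python
import re

# Pattern from the answer: one alphanumeric character, then any mix of
# alphanumerics, dashes and underscores, ending on a - or _.
MIXED = re.compile(r"[a-zA-Z0-9](?:[-_a-zA-Z0-9]*[-_])")


def mixed_prefix(text):
    """Return the matched prefix, or None if there is no match at the
    start of the string (hypothetical helper)."""
    m = MIXED.match(text)
    return m.group() if m else None


print(mixed_prefix("AB--CD_rest text"))
print(mixed_prefix("NoSeparators"))
```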


Upvotes: 1

OneCricketeer

Reputation: 192023

What you've written means the match must end in one or more - or _ characters after a single alphanumeric run, rather than having one separator after each run, followed by "garbage text".

You need to group the repeated pattern.

Then, if "garbage text" follows, you can match it with .+

([a-zA-Z0-9]{1,20}[_-])+.+

But .+ will include spaces and any other symbols rather than just [a-zA-Z0-9], so \w+ might be "safer".
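
To make the difference between the two tails concrete, here is a short Python sketch (Python is an assumption, and the names `ANY_TAIL`, `WORD_TAIL` and `split_with` are invented for this example). The repeated group is wrapped in an extra capturing group so the whole prefix can be read back, because a repeated capturing group only keeps its last repetition.

```python
import re

# Grouped pattern from the answer, with the prefix and the tail captured.
ANY_TAIL = re.compile(r"((?:[a-zA-Z0-9]{1,20}[_-])+)(.+)")
# Same prefix, but the tail is restricted to word characters.
WORD_TAIL = re.compile(r"((?:[a-zA-Z0-9]{1,20}[_-])+)(\w+)")


def split_with(pattern, text):
    """Return (prefix, tail) if the whole string matches, else None
    (hypothetical helper)."""
    m = pattern.fullmatch(text)
    return m.groups() if m else None


print(split_with(ANY_TAIL, "AB-CD_some garbage!"))
print(split_with(WORD_TAIL, "AB-CD_some garbage!"))
print(split_with(WORD_TAIL, "AB-CD_garbage"))
```

Keep in mind that \w also matches the underscore, so the boundary between prefix and tail is decided by how far the greedy prefix group can extend.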

Upvotes: 2
