taga

Reputation: 3895

Regex for a combination of letters and digits in any order

How can I write a regex that matches letters and digits combined in a string in any order? For example, to read some kind of invoice number, I have examples like these:

example 1: 32ah3af
example 2: 32ahPP-A2ah3af
example 3: 3A63af-3HHx3B-APe5y5-9OPiis
example 4: 3A63af 3HHx3B APe5y5 9OPiis

So each 'block' has a length between 3 and 7 characters (letters or digits) in any order (letters can be lowercase or uppercase). Each 'block' can start with a letter or a digit. There can be one 'block' or at most 4 blocks, separated by ' ' or -.

I know that I can match the separators with \s or \-, but I have no idea how to match this kind of block that has (or does not have) a separator.

I tried with something like this:

([0-9]?[A-z]?){3,7}

But it does not work.

Upvotes: 0

Views: 1102

Answers (2)

The fourth bird

Reputation: 163467

You could use

^[A-Za-z0-9]{3,7}(?:[ -][A-Za-z0-9]{3,7}){0,3}\b

The pattern matches:

  • ^ Start of string
  • [A-Za-z0-9]{3,7} Match 3-7 times either a lower or uppercase char a-z or number 0-9
  • (?: Non capture group
    • [ -][A-Za-z0-9]{3,7} Match either a space or - and 3-7 times either a lower or uppercase char a-z or number 0-9
  • ){0,3} Close the non capture group and repeat 0-3 times to have a maximum of 4 occurrences
  • \b A word boundary so the match cannot stop in the middle of a longer run of letters or digits

Regex demo
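
As a quick check, the pattern can be run against the four examples from the question. This is a minimal Python sketch; the names INVOICE and STRICT are only illustrative, and the STRICT variant assumes that the whole string must be an invoice number, which the question does not state explicitly.

```python
import re

# The pattern from this answer, unchanged.
INVOICE = re.compile(r"^[A-Za-z0-9]{3,7}(?:[ -][A-Za-z0-9]{3,7}){0,3}\b")

# Assumed stricter variant: same blocks, but meant to be used with
# fullmatch() so that nothing may follow the last block.
STRICT = re.compile(r"[A-Za-z0-9]{3,7}(?:[ -][A-Za-z0-9]{3,7}){0,3}")

samples = [
    "32ah3af",
    "32ahPP-A2ah3af",
    "3A63af-3HHx3B-APe5y5-9OPiis",
    "3A63af 3HHx3B APe5y5 9OPiis",
]

for s in samples:
    m = INVOICE.match(s)
    print(s, "->", m.group(0) if m else None)

# A run of 8 letters is rejected: \b cannot match inside the run.
print(INVOICE.match("abcdefgh"))  # None

# A fifth block is not rejected; the match simply stops after four blocks.
print(INVOICE.match("aaa-bbb-ccc-ddd-eee").group(0))  # aaa-bbb-ccc-ddd
print(STRICT.fullmatch("aaa-bbb-ccc-ddd-eee"))  # None
```

If input with more than four blocks should be rejected rather than matched partially, anchoring the end (for example with fullmatch, as in the sketch) is one option.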

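Note that [A-z] matches more than [A-Za-z]: the range also covers the characters [ \ ] ^ _ ` that sit between Z and a in ASCII, and it does not match digits at all.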
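
The difference can be checked directly. The sketch below is only an illustration in Python; the variable names extra and attempt are made up for this example. It lists the characters that [A-z] accepts beyond the letters, and shows why the attempt from the question is too permissive.

```python
import re

# Characters in the ASCII range A..z that [A-z] accepts but [A-Za-z] does not.
extra = [
    chr(c)
    for c in range(ord("A"), ord("z") + 1)
    if re.fullmatch(r"[A-z]", chr(c)) and not re.fullmatch(r"[A-Za-z]", chr(c))
]
print(extra)  # ['[', '\\', ']', '^', '_', '`']

# The attempt from the question: both parts of each repetition are optional,
# so it accepts the empty string, punctuation such as "___", and strings of
# up to 14 characters (7 repetitions of one digit plus one letter).
attempt = re.compile(r"([0-9]?[A-z]?){3,7}")
print(bool(attempt.fullmatch("")))                # True
print(bool(attempt.fullmatch("___")))             # True
print(bool(attempt.fullmatch("1a2b3c4d5e6f7g")))  # True
```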

Upvotes: 2

sophros

Reputation: 16700

As long as you want to only capture / search for the invoice ids, the suggestion from Hao Wu is valid:

 r'\w{3,7}'

for regex (check here).

If you can drop the remaining part, then this should be enough.
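
A small Python sketch of how this behaves with re.findall (the name token is just for illustration). Two caveats that follow from the pattern itself: \w also matches the underscore, and a run longer than 7 characters is split into pieces instead of being rejected.

```python
import re

token = re.compile(r"\w{3,7}")

print(token.findall("3A63af-3HHx3B-APe5y5-9OPiis"))
# ['3A63af', '3HHx3B', 'APe5y5', '9OPiis']

# The word "example" is 7 word characters long, so it is picked up too;
# the lone "1" is too short and is skipped.
print(token.findall("example 1: 32ah3af"))  # ['example', '32ah3af']

# A longer run is split rather than rejected.
print(token.findall("abcdefghij"))  # ['abcdefg', 'hij']

# The underscore counts as a word character.
print(token.findall("ab_cd"))  # ['ab_cd']
```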

You can capture the whole line more precisely, including the leading example number:

r'example (\d+): ((\w{3,7}[\- ]?)+)'

See here how it works. Please note how capturing groups are represented.
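
To make the group behaviour concrete, here is a short Python sketch; line_re and blocks are illustrative names, and the input line is built from example 3 in the question. A repeated capturing group keeps only its last repetition, so the individual blocks are obtained by splitting the second group.

```python
import re

line_re = re.compile(r"example (\d+): ((\w{3,7}[\- ]?)+)")

m = line_re.search("example 3: 3A63af-3HHx3B-APe5y5-9OPiis")

print(m.group(1))  # 3
print(m.group(2))  # 3A63af-3HHx3B-APe5y5-9OPiis
print(m.group(3))  # 9OPiis  (only the last repetition is kept)

# Split the full id on the separators to get every block.
blocks = re.split(r"[- ]", m.group(2))
print(blocks)  # ['3A63af', '3HHx3B', 'APe5y5', '9OPiis']
```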

Upvotes: 0
