abhijeet_chib

Reputation: 11

Regular Expressions for specific number patterns

I have an invoice in readable form and need to extract the PO number from it. PO numbers come in a particular format (for example 26123456, 26234567): each starts with 26 and is followed by 6 digits. I am trying to extract them using regular expressions.

I have tried the following patterns:

[26]\d{6,6} and also ^[26]\d{6,6}

However, the problem I am facing is:

If the PO number is 26454545 and other numbers appear before it in the invoice, such as telephone numbers that contain a 2 or a 6, part of those numbers is extracted as well. For example, 12345678987 also produces a match, since a 2 and a 6 are present in it.

Upvotes: 0

Views: 130

Answers (1)

Avinash Raj

Reputation: 174796

Remove the character class and add word boundaries.

\b26\d{6}\b
  • [26] is a character class: it matches a single character from the given list, either 2 or 6. To match the number 26, just write it literally.

  • Adding \b at the start and at the end makes the pattern match only a complete number, since \b matches between a word character and a non-word character. You could also use lookaround assertions here, such as (?<!\d)26\d{6}(?!\d).
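To see the difference between these patterns, here is a minimal sketch in Python (the question does not say which language or tool is used, so Python's `re` module is an assumption, and the sample strings are made up for illustration):

```python
import re

# Hypothetical invoice text: a phone number followed by a PO number.
text = "Tel: 12345678987, PO: 26454545"

# Original attempt: [26] matches ONE character, a 2 or a 6,
# so any 2 or 6 followed by six digits is picked up.
loose = re.findall(r"[26]\d{6,6}", text)

# Literal 26 plus word boundaries: only a standalone 8-digit number.
strict = re.findall(r"\b26\d{6}\b", text)

# Lookaround variant: only requires that no digit touches the match.
lookaround = re.findall(r"(?<!\d)26\d{6}(?!\d)", text)

print(loose)       # ['2345678', '2645454']
print(strict)      # ['26454545']
print(lookaround)  # ['26454545']

# The two fixed patterns differ when a letter is glued to the number:
# letters are word characters, so there is no \b between 'O' and '2'.
glued = "PO26454545"
print(re.findall(r"\b26\d{6}\b", glued))            # []
print(re.findall(r"(?<!\d)26\d{6}(?!\d)", glued))   # ['26454545']
```

If the PO number can appear directly after letters, as in the last case, the lookaround form is probably the safer choice.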

There is another pattern that I want to extract: 12300012345. After the first three digits there are always three zeros followed by five digits.

\b\d{3}000\d{5}\b

If you want to combine both, use the regex alternation operator |

\b26\d{6}\b|\b\d{3}000\d{5}\b
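As a rough illustration of the combined pattern, the sketch below again assumes Python's `re` module. The helper name `extract_numbers` and the sample text are hypothetical and not part of the original question:

```python
import re

# Combined pattern: either 26 + six digits, or three digits + 000 + five digits.
PO_PATTERN = re.compile(r"\b26\d{6}\b|\b\d{3}000\d{5}\b")

def extract_numbers(invoice_text):
    """Return every standalone number matching either format, in order."""
    return PO_PATTERN.findall(invoice_text)

# Made-up sample: the phone number matches neither alternative.
sample = "Phone 12345678987, PO 26123456, ref 12300012345, other 26234567"
found = extract_numbers(sample)
print(found)  # ['26123456', '12300012345', '26234567']
```

Because neither alternative contains a capturing group, `findall` returns the full matched strings.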

Upvotes: 1
