badcoder

Reputation: 3834

Python negative regex

I have a string such as:

s = "The code for the product is A8H4DKE3SP93W6J and you can buy it here."

The text in this string will not always be in the same format; it is dynamic, so I can't do a simple find and replace to obtain the product code.

I can see that:

re.sub(r'A[0-9a-zA-Z_]{14} ', '', s)

will get rid of the product code. How do I go about doing the opposite of this, i.e. deleting all of the text apart from the product code? The product code will always be a 15-character string starting with the letter A.

I have been racking my brain and Googling to find a solution, but can't seem to figure it out.

Thanks

Upvotes: 0

Views: 210

Answers (2)

Praveen Yalagandula

Reputation: 4694

In a regex, you can capture the portion you want to keep by putting parentheses around that part of the pattern, and then refer to it in the replacement string with a backslash followed by the index of that group. In the code below, "(A[0-9A-Za-z_]{14})" is the group that captures the product code, and "\1" in the replacement inserts it into the result. The ".*" on either side matches, and so discards, the rest of the string.

re.sub(r'.*(A[0-9A-Za-z_]{14}).*', r'\1', s)
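As a rough sketch of how this behaves, the snippet below runs the substitution on the string from the question and on a second string with no product code in it. The second string and the variable names are made up for illustration. Two things are worth keeping in mind: when nothing matches, re.sub() returns the input unchanged rather than an empty string, and "." does not match newlines by default, so multi-line text would likely need the re.DOTALL flag.

```python
import re

s = "The code for the product is A8H4DKE3SP93W6J and you can buy it here."

# The parentheses capture the code; \1 in the replacement refers back to it.
pattern = r'.*(A[0-9A-Za-z_]{14}).*'

code = re.sub(pattern, r'\1', s)
print(code)

# Hypothetical input without a product code: the pattern does not match,
# so re.sub() hands back the original string untouched.
no_code = "There is no product code in this sentence."
unchanged = re.sub(pattern, r'\1', no_code)
print(unchanged == no_code)
```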

Upvotes: 0

alecxe

Reputation: 473833

Instead of substituting the rest of the string, use re.search() to search for the product number:

In [1]: import re

In [2]: s = "The code for the product is A8H4DKE3SP93W6J and you can buy it here."

In [3]: re.search(r"A[0-9a-zA-Z_]{14}", s).group()
Out[3]: 'A8H4DKE3SP93W6J'
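Since the text is dynamic, it may be worth guarding against input that has no product code at all: re.search() returns None when nothing matches, and calling .group() on None raises an AttributeError. A minimal sketch, where extract_product_code is a hypothetical helper name and not something from the answer above:

```python
import re

# Same pattern as above, compiled once so it can be reused.
PRODUCT_CODE = re.compile(r"A[0-9a-zA-Z_]{14}")


def extract_product_code(text):
    """Return the first product code found in text, or None if there is none.

    Hypothetical helper for illustration only.
    """
    match = PRODUCT_CODE.search(text)
    return match.group() if match else None


s = "The code for the product is A8H4DKE3SP93W6J and you can buy it here."
print(extract_product_code(s))
print(extract_product_code("no code here"))
```

The first call prints the code and the second prints None instead of raising.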

Upvotes: 1
