Reputation: 3834
I have a string such as:
s = "The code for the product is A8H4DKE3SP93W6J and you can buy it here."
The text in this string will not always be in the same format. It is dynamic, so I can't do a simple find and replace to obtain the product code.
I can see that:
re.sub(r'A[0-9a-zA-Z_]{14} ', '', s)
will get rid of the product code. How do I go about doing the opposite of this, i.e. deleting all of the text apart from the product code? The product code will always be a 15-character string starting with the letter A.
I have been racking my brain and Googling to find a solution, but can't seem to figure it out.
Thanks
Upvotes: 0
Views: 210
Reputation: 4694
In a regex, you can capture the portion you want to keep by putting parentheses around that part of the pattern, and then refer to it in the replacement string with a backslash followed by the index of that group. In the code below, "(A[0-9A-Za-z_]{14})" is the group you want to keep, and "\1" in the replacement stands for whatever it matched. The ".*" on either side matches the rest of the string, so it is dropped from the result.
re.sub(r'.*(A[0-9A-Za-z_]{14}).*', r'\1', s)
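One thing to watch with this approach: if the pattern does not match at all, re.sub() returns the input unchanged rather than an empty string. A small sketch of both cases, using the string from the question plus a made-up sentence without a code:

```python
import re

s = "The code for the product is A8H4DKE3SP93W6J and you can buy it here."

# Group 1 captures the 15-character code; \1 in the replacement keeps only that group.
code = re.sub(r'.*(A[0-9A-Za-z_]{14}).*', r'\1', s)
print(code)

# Caveat: when nothing matches, re.sub leaves the input as it was.
# (This second sentence is an invented example, not from the question.)
unchanged = re.sub(r'.*(A[0-9A-Za-z_]{14}).*', r'\1', "No code in this sentence.")
print(unchanged)
```

So if some inputs might not contain a code, you would need to check the result before trusting it.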
Upvotes: 0
Reputation: 473833
Instead of substituting away the rest of the string, use re.search() to search for the product code:
In [1]: import re
In [2]: s = "The code for the product is A8H4DKE3SP93W6J and you can buy it here."
In [3]: re.search(r"A[0-9a-zA-Z_]{14}", s).group()
Out[3]: 'A8H4DKE3SP93W6J'
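Note that re.search() returns None when nothing matches, so calling .group() directly would raise an AttributeError on a string without a code. Here is a minimal sketch of a guarded version. The helper name extract_code is just for illustration, and the \b word boundaries are an assumption on my part: they only help if the code always appears as a separate word, so drop them if it can be glued to other letters or digits.

```python
import re

# \b on both sides is an assumption: the code stands alone as a word,
# so a longer run such as a 20-character token is not partially matched.
CODE_RE = re.compile(r"\bA[0-9A-Za-z_]{14}\b")

def extract_code(text):
    """Return the first 15-character code starting with 'A', or None if absent."""
    match = CODE_RE.search(text)
    return match.group() if match else None

s = "The code for the product is A8H4DKE3SP93W6J and you can buy it here."
print(extract_code(s))
print(extract_code("nothing to see here"))
```

If a string can contain several codes, re.findall() with the same pattern would return all of them as a list.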
Upvotes: 1