Faiz Mohamed Haneef

Reputation: 3596

Remove uuid4 string pattern

I have the following example strings:

1# 00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708-62fa6ae2-599c-4ff1-8249-bf6411ce3be7-83930e63-2149-40f0-b6ff-0838596a9b89 Kin

2# 00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708 Kin

I am trying to remove the UUID4-generated string, and any text to the right of it, in Python.

The output should be 00000 Gin for both examples.

I have checked "What is the correct regex for matching values generated by uuid.uuid4().hex?", but it still doesn't help.

Upvotes: 1

Views: 2323

Answers (1)

Jan

Reputation: 43169

You could use:

import re

strings = ["00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708-62fa6ae2-599c-4ff1-8249-bf6411ce3be7-83930e63-2149-40f0-b6ff-0838596a9b89 Kin",
"00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708 Kin"]

rx = re.compile(r'^[^-]+')
# anchor at the start of the string, then greedily match every character that is not a hyphen

new_strings = [match.group(0)
                for string in strings
                for match in [rx.search(string)]
                if match]
print(new_strings)
# ['00000 Gin', '00000 Gin']


See a demo on ideone.com.
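
Note that the expression above keeps everything up to the first hyphen, so it relies on the prefix itself containing no hyphen. If that cannot be guaranteed, one alternative is to match the UUID explicitly and delete it together with whatever follows. The following is only a sketch: the pattern assumes the canonical hyphenated UUID version 4 text form, and `strip_uuid4_suffix` is a hypothetical helper name, not something from the question.

```python
import re

# Assumption: UUIDs appear in the canonical 8-4-4-4-12 hex form, version 4
# (third group starts with 4, fourth group starts with 8, 9, a or b).
# The optional leading hyphen is the one joining the UUID to the prefix.
UUID4_AND_REST = re.compile(
    r'-?[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}.*$',
    re.IGNORECASE,
)

def strip_uuid4_suffix(text):
    """Remove the first UUID4 and everything to its right (hypothetical helper)."""
    return UUID4_AND_REST.sub('', text)

samples = [
    "00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708-62fa6ae2-599c-4ff1-8249-bf6411ce3be7-83930e63-2149-40f0-b6ff-0838596a9b89 Kin",
    "00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708 Kin",
    "alpha-beta-a19ea68e-64bf-4471-b4d1-44f6bd9c1708 x",  # hyphen inside the prefix
    "no uuid here",                                        # left unchanged
]

cleaned = [strip_uuid4_suffix(s) for s in samples]
print(cleaned)
# ['00000 Gin', '00000 Gin', 'alpha-beta', 'no uuid here']
```

Strings without a UUID pass through untouched, which may or may not be what you want; checking `UUID4_AND_REST.search(text)` first would let you treat them separately.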
To actually check if your string is of the desired format, you could use the following expression:

^
(?P<interesting>.+?)                   # before
(?P<uid>\b\w{8}-(?:\w{4}-){3}\w{12}\b) # uid
(?P<junk>.+)                           # garbage
$

See a demo for this one on regex101.com (mind the modifiers: the whitespace and comments in the pattern require verbose mode).
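
In Python, the expression above could be used roughly as follows. This is a minimal sketch: it assumes `re.VERBOSE` is the modifier needed (so that the layout whitespace and `#` comments are ignored), and the variable names are illustrative only.

```python
import re

# Same expression as above, compiled in verbose mode.
rx = re.compile(r'''
    ^
    (?P<interesting>.+?)                   # before
    (?P<uid>\b\w{8}-(?:\w{4}-){3}\w{12}\b) # uid
    (?P<junk>.+)                           # garbage
    $
''', re.VERBOSE)

s = "00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708 Kin"
m = rx.match(s)
if m:
    # The "interesting" group still ends with the joining hyphen, so strip it.
    prefix = m.group('interesting').rstrip('-')
    uid = m.group('uid')
    junk = m.group('junk')
    print(repr(prefix), repr(uid), repr(junk))
```

Two things to keep in mind: `\w` also accepts non-hex characters, so this checks the 8-4-4-4-12 shape rather than a strict UUID4, and because the last group is `.+` the string must have at least one character after the UUID for the match to succeed.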

Upvotes: 1
