Reputation: 3596
I have the following example strings:
1# 00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708-62fa6ae2-599c-4ff1-8249-bf6411ce3be7-83930e63-2149-40f0-b6ff-0838596a9b89 Kin
2# 00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708 Kin
I am trying to remove the uuid4-generated string, and any text to the right of it, in Python.
The output should be 00000 Gin in both examples.
I have checked "What is the correct regex for matching values generated by uuid.uuid4().hex?", but it still doesn't help.
Upvotes: 1
Views: 2323
Reputation: 43169
You could use:
import re

strings = ["00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708-62fa6ae2-599c-4ff1-8249-bf6411ce3be7-83930e63-2149-40f0-b6ff-0838596a9b89 Kin",
           "00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708 Kin"]

# anchor at the start and greedily match everything that is not a hyphen
rx = re.compile(r'^[^-]+')

# wrapping rx.search() in a one-element list binds the match to a name
# so that strings without a match can be filtered out
new_strings = [match.group(0)
               for string in strings
               for match in [rx.search(string)]
               if match]
print(new_strings)
# ['00000 Gin', '00000 Gin']
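Note that this only works as long as the part you want to keep contains no hyphen itself. Alternatively, match the UUID explicitly with a verbose pattern:

^
(?P<interesting>.+?)                     # before
(?P<uid>\b\w{8}-(?:\w{4}-){3}\w{12}\b)   # uid
(?P<junk>.+)                             # garbage
$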
See a demo for this one on regex101.com (mind the modifiers: the pattern needs at least the verbose flag, since it contains whitespace and comments).
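The verbose pattern can be applied from Python roughly as follows. This is only a sketch: the helper name strip_uuid_suffix is made up for illustration, and the rstrip("-") call is an addition that is needed because the lazy "interesting" group also captures the hyphen joining the prefix to the UUID.

```python
import re

# The pattern from the answer, compiled with re.VERBOSE so that the
# whitespace and "# ..." comments inside it are ignored by the engine.
UUID_RX = re.compile(
    r"""
    ^
    (?P<interesting>.+?)                     # before
    (?P<uid>\b\w{8}-(?:\w{4}-){3}\w{12}\b)   # uid
    (?P<junk>.+)                             # garbage
    $
    """,
    re.VERBOSE,
)


def strip_uuid_suffix(text):
    """Hypothetical helper: keep only the text before the first UUID."""
    match = UUID_RX.search(text)
    if match is None:
        return text  # no UUID found, leave the string untouched
    # The captured prefix ends with the joining hyphen, so strip it.
    return match.group("interesting").rstrip("-")


strings = [
    "00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708-62fa6ae2-599c-4ff1-8249-bf6411ce3be7-83930e63-2149-40f0-b6ff-0838596a9b89 Kin",
    "00000 Gin-a19ea68e-64bf-4471-b4d1-44f6bd9c1708 Kin",
]
new_strings = [strip_uuid_suffix(s) for s in strings]
print(new_strings)
# ['00000 Gin', '00000 Gin']
```

Unlike the ^[^-]+ version, this keeps hyphens that belong to the prefix itself, at the cost of a longer pattern. Because the "junk" group uses .+, a string that ends directly after the UUID will not match and is returned unchanged; switching to .* would cover that case.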
Upvotes: 1