Yang HG

Reputation: 720

Capture multiple substrings in one group

I have a folder path that contains a database table name and an ID, like this:

path = '/something/else/TableName/000/123/456/789'

Of course, I can match TableName/000/123/456/789 and then split it in Python:

import re
matched = re.findall(r'.*?/(\w+(?:/\d+){4})', path)[0]  # TableName/000/123/456/789
split_text = matched.split('/')  # ['TableName', '000', '123', '456', '789']
table_name = split_text[0]  # 'TableName'
id = int(''.join(split_text[1:]))  # 123456789

But I want to know whether regex provides any feature that can do this in one step. I've tried these approaches:

re.match(r'.*?/(?P<table_name>\w+)(?:/(?P<id>\d+)){4}', path).groupdict()  # {'table_name': 'TableName', 'id': '789'}
re.split(r'.*?/(\w+)(?:/(\d+)){4}', path)  # ['', 'TableName', '789', '']
re.sub(r'(.*?/)\w+(?:(/)\d+){4}', '', path)  # '' (the whole string is replaced)

Is there any other way, or must I use the Python script above? Ideally the result would be {'table_name': 'TableName', 'id': '000123456789'} or ('TableName', '000123456789'), or at least ('TableName', '000', '123', '456', '789').

Upvotes: 0

Views: 61

Answers (2)

sanooj

Reputation: 493

The easiest way would be to expand the grouping, writing one capturing group per ID segment:

>>> match = re.search(r'.*?/(\w+)(?:/(\d+))(?:/(\d+))(?:/(\d+))(?:/(\d+))', path)
>>> match.groups()
('TableName', '000', '123', '456', '789')
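
If the joined form ('TableName', '000123456789') from the question is preferred, the expanded groups can be concatenated afterwards. The following is only a minimal sketch: it relies on the fact that Python's re module keeps just the last match of a repeated capturing group, which is why each segment gets its own group, and parse_path is a hypothetical helper name, not something from the question.

```python
import re

path = '/something/else/TableName/000/123/456/789'

# One capturing group per ID segment, because a repeated group
# such as (?:/(\d+)){4} would keep only its last match ('789').
pattern = re.compile(r'.*?/(\w+)/(\d+)/(\d+)/(\d+)/(\d+)')


def parse_path(p):
    """Hypothetical helper: return (table_name, id_string), or None if no match."""
    m = pattern.match(p)
    if m is None:
        return None
    # First group is the table name, the remaining four are the ID segments.
    table_name, *id_parts = m.groups()
    return table_name, ''.join(id_parts)


result = parse_path(path)
print(result)  # ('TableName', '000123456789')
```

The ID is kept as a string so the leading zeros survive; wrap it in int() if a number is wanted instead.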

Upvotes: 0

str028

Reputation: 204

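The simplest way is to avoid the quantifier and write each group out (a raw string keeps the backslashes intact, and the slashes need no escaping):

re.findall(r'(\w+)/(\d+)/(\d+)/(\d+)/(\d+)', path)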

[('TableName', '000', '123', '456', '789')]
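
To get the dictionary form asked for in the question, one possible variation is to keep the quantifier but put the whole ID part in a single named group, then strip the slashes afterwards. This is a sketch under the assumption that a small post-processing step is acceptable; it is not a pure one-step regex solution.

```python
import re

path = '/something/else/TableName/000/123/456/789'

# The 'id' group wraps the whole repeated part, so it captures
# '/000/123/456/789' in full instead of only the last segment.
pattern = re.compile(r'.*?/(?P<table_name>\w+)(?P<id>(?:/\d+){4})')

m = pattern.match(path)
if m is not None:
    result = m.groupdict()
    # Drop the separators to get the contiguous ID string.
    result['id'] = result['id'].replace('/', '')
else:
    result = None

print(result)  # {'table_name': 'TableName', 'id': '000123456789'}
```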

Upvotes: 1
