Capture multiple substrings in one group

Question

I have a folder path that contains a database table name and an ID. It looks like this:

path = '/something/else/TableName/000/123/456/789'

Of course, I can match TableName/000/123/456/789 and then split it in Python:

import re
matched = re.findall(r'.*?/(\w+(?:/\d+){4})', path)[0]  # TableName/000/123/456/789
split_text = matched.split('/')  # ['TableName', '000', '123', '456', '789']
table_name = split_text[0]  # 'TableName'
id = int(''.join(split_text[1:]))  # 123456789

But I'd like to know whether regex offers any way to do this in one step. I've tried the following:

re.match(r'.*?/(?P<table_name>\w+)(?:/(?P<id>\d+)){4}', path).groupdict()  # {'table_name': 'TableName', 'id': '789'}
re.split(r'.*?/(\w+)(?:/(\d+)){4}', path)  # ['', 'TableName', '789', '']
re.sub(r'(.*?/)\w+(?:(/)\d+){4}', '', path)  # '', full string has been replaced

Is there any other way, or must I use the Python script above? Ideally the result would be {'table_name': 'TableName', 'id': '000123456789'} or ('TableName', '000123456789'), or at least ('TableName', '000', '123', '456', '789').

str028 · Accepted Answer

The simplest way is to avoid the quantifier and write one capture group per segment, because a repeated capture group keeps only its last match:

re.findall(r'(\w+)/(\d+)/(\d+)/(\d+)/(\d+)', path)

[('TableName', '000', '123', '456', '789')]
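To get from that tuple to the dictionary the question asks for, one option is to name the first group and join the four numeric groups afterwards. This is only a minimal sketch: the names pattern, match, id_text and result are chosen here for illustration, and the trailing $ assumes the ID is always the last part of the path.

```python
import re

path = '/something/else/TableName/000/123/456/789'

# One capture group per segment: a quantified group such as (?:/(\d+)){4}
# would keep only its last repetition ('789').
# Assumption: the ID segments end the path, hence the trailing $.
pattern = re.compile(r'(?P<table_name>\w+)/(\d+)/(\d+)/(\d+)/(\d+)$')

match = pattern.search(path)
if match is None:
    raise ValueError(f'no table name and ID found in {path!r}')

# The named group is group 1; groups 2 to 5 hold the numeric segments.
id_text = ''.join(match.group(2, 3, 4, 5))
result = {'table_name': match.group('table_name'), 'id': id_text}
print(result)
```

The join still happens in Python rather than inside the regex, since Python's re module exposes only the final repetition of a quantified group.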
