Reputation: 41
I'm using Python to parse an SDDL string with a regex. The SDDL is always in the form of 'type:some text', repeated up to 4 times. The type is one of 'O', 'G', 'D', or 'S', followed by a colon. The 'some text' part is variable in length.
Here is a sample SDDL:
O:DAG:S-1-5-21-2021943911-1813009066-4215039422-1735D:(D;;0xf0007;;;AN)(D;;0xf0007;;;BG)S:NO_ACCESS_CONTROL
Here is what I have so far. Two of the tuples are returned just fine, but the other two - ('G','S-1-5-21-2021943911-1813009066-4215039422-1735') and ('S','NO_ACCESS_CONTROL') are not.
import re
sddl="O:DAG:S-1-5-21-2021943911-1813009066-4215039422-1735D:(D;;0xf0007;;;AN)(D;;0xf0007;;;BG)S:NO_ACCESS_CONTROL"
matches = re.findall('(.):(.*?).:',sddl)
print matches
[('O', 'DA'), ('D', '(D;;0xf0007;;;AN)(D;;0xf0007;;;BG)')]
What I'd like to have returned is:
[('O', 'DA'), ('G','S-1-5-21-2021943911-1813009066-4215039422-1735'), ('D', '(D;;0xf0007;;;AN)(D;;0xf0007;;;BG)'),('S','NO_ACCESS_CONTROL')]
Upvotes: 4
Views: 893
Reputation: 1789
It seems like using regex isn't the best solution to this problem. Really, all you want to do is split across the colons and then do some transformations on the resulting list.
chunks = sddl.split(':')
pairs = [(chunks[i][-1],
          chunks[i+1][:-1] if i < len(chunks) - 2 else chunks[i+1])
         for i in range(len(chunks) - 1)]
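For readability, the same split-based idea can be unrolled into a small function. This is only a sketch: the name split_sddl is made up for illustration, and it assumes that every type is a single character and that the text between types never contains a colon of its own.

```python
def split_sddl(sddl):
    """Split an SDDL string into (type, text) pairs without using regex."""
    chunks = sddl.split(':')
    pairs = []
    for i in range(len(chunks) - 1):
        # The last character before each colon is the type letter.
        key = chunks[i][-1]
        value = chunks[i + 1]
        # Every chunk except the final one ends with the next type letter,
        # which belongs to the following pair, so drop it here.
        if i < len(chunks) - 2:
            value = value[:-1]
        pairs.append((key, value))
    return pairs


sddl = ("O:DAG:S-1-5-21-2021943911-1813009066-4215039422-1735"
        "D:(D;;0xf0007;;;AN)(D;;0xf0007;;;BG)S:NO_ACCESS_CONTROL")
pairs = split_sddl(sddl)
```

If an input ever had two colons in a row, chunks[i][-1] would raise an IndexError on the empty chunk, so input that may be malformed would need a guard.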
Upvotes: 0
Reputation: 208555
Try the following:
(.):(.*?)(?=.:|$)
Example:
>>> re.findall(r'(.):(.*?)(?=.:|$)', sddl)
[('O', 'DA'), ('G', 'S-1-5-21-2021943911-1813009066-4215039422-1735'), ('D', '(D;;0xf0007;;;AN)(D;;0xf0007;;;BG)'), ('S', 'NO_ACCESS_CONTROL')]
This regex starts out the same way as yours, but instead of including the .: at the end as a part of the match, a lookahead is used. This is necessary because re.findall() will not return overlapping matches, so you need each match to stop before the next match begins.
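To see the consumption problem directly, it can help to look at the full text of each match rather than just the groups. The sketch below uses re.finditer() on the sample string from the question; the variable names are only for illustration.

```python
import re

sddl = ("O:DAG:S-1-5-21-2021943911-1813009066-4215039422-1735"
        "D:(D;;0xf0007;;;AN)(D;;0xf0007;;;BG)S:NO_ACCESS_CONTROL")

# Original pattern: the trailing ".:" is part of each match, so the next
# type letter and its colon are used up and cannot start a new match.
consuming = [m.group(0) for m in re.finditer(r'(.):(.*?).:', sddl)]

# Lookahead pattern: the next "type:" is checked but left in place,
# so the following match can begin right there.
non_consuming = [m.group(0) for m in re.finditer(r'(.):(.*?)(?=.:|$)', sddl)]

print(consuming)
print(non_consuming)
```

With the original pattern, the first match is 'O:DAG:', which swallows the 'G:' that should have started the second pair. With the lookahead, the matches sit side by side and together cover the whole string.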
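The lookahead (?=.:|$) essentially means "match only if the next characters are any single character followed by a colon, or we are at the end of the string". The lookahead only checks those characters; it does not consume them, so they remain available for the next match.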
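If the types really are limited to the four letters listed in the question, a slightly stricter variant is possible. This is a sketch under that assumption, not a general SDDL parser: the names SDDL_PART and parse_sddl are made up here, and the pattern would still split in the wrong place if the text itself contained one of those letters followed by a colon.

```python
import re

# Assumption: only O, G, D and S occur as type letters, as stated in the
# question. The lookahead uses the same character class, so the text stops
# only before another recognised "type:" or at the end of the string.
SDDL_PART = re.compile(r'([OGDS]):(.*?)(?=[OGDS]:|$)')


def parse_sddl(sddl):
    """Return a dict mapping each type letter to its text."""
    return dict(SDDL_PART.findall(sddl))


sddl = ("O:DAG:S-1-5-21-2021943911-1813009066-4215039422-1735"
        "D:(D;;0xf0007;;;AN)(D;;0xf0007;;;BG)S:NO_ACCESS_CONTROL")
parts = parse_sddl(sddl)
print(parts['G'])
```

A dict is convenient here because each type appears at most once in the described format; keeping the list of tuples from re.findall() works just as well if order matters.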
Upvotes: 2