Adam

Reputation: 119

Trouble matching LDAP entry with multiline regex

I have multiple log files with LDAP entries. I'm trying to match only the entries whose createtimestamp falls within a certain date range, and I want to capture the whole entry, not just the timestamp. The entries look like this:

dn: ....
otherattr: 
...
createtimestamp: 20130621061525Z

The problem is that I am getting all of the entries that come before the one I want as well.

dn: ....
otherattr: 
...
createtimestamp: 20121221082545Z

dn: ....
otherattr: 
...
createtimestamp: 20130621061525Z

This is the expression:

dn_search = re.compile(r'dn: (.*?)createtimestamp: 20130[4-6]\d+?Z', flags=re.M|re.S)
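For reference, here is a minimal reproduction of the behaviour, assuming two made-up entries (the dn values and the `sample` name are invented for illustration). The lazy `.*?` does not seem to help here, because the match still begins at the first `dn: ` in the text and simply grows until the timestamp pattern succeeds:

```python
import re

# Made-up sample: one entry from 2012, one from June 2013.
sample = (
    "dn: uid=old,dc=example,dc=com\n"
    "otherattr: a\n"
    "createtimestamp: 20121221082545Z\n"
    "\n"
    "dn: uid=new,dc=example,dc=com\n"
    "otherattr: b\n"
    "createtimestamp: 20130621061525Z\n"
)

dn_search = re.compile(r'dn: (.*?)createtimestamp: 20130[4-6]\d+?Z', flags=re.M|re.S)

match = dn_search.search(sample)
# The single match starts at the 2012 entry and runs through the 2013 one.
print(match.group(0).count("dn: "))
print(match.group(0))
```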

I've tried some other expressions but I am either getting only the createtimestamp or unwanted entries. Any ideas?

Upvotes: 2

Views: 485

Answers (2)

Ro Yo Mi

Reputation: 15010

Description

This regex assumes that each entry starts with dn: and ends with an empty line. It matches the entire group of lines and captures the value of the createtimestamp field in group 1.

^dn:(?=(?:(?!^createtimestamp:|^dn:|^\s*(?:\r|\n|$)|\Z).)*^createtimestamp:\s*([^\s\r\n]*))(?:(?!^dn:|^\s*(?:\r|\n|$)|\Z).)*


Python Code example

Link to working example http://repl.it/J0t

Code

import re

string = """dn: ....
otherattr: 
...
createtimestamp: 20121221082545Z_1

dn: ....
otherattr: 
...
createtimestamp: 20130621061525Z_2
"""

# Group 0 is the whole entry; group 1 is the createtimestamp value.
for matchObj in re.finditer(r'^dn:(?=(?:(?!^createtimestamp:|^dn:|^\s*(?:\r|\n|$)|\Z).)*^createtimestamp:\s*([^\s\r\n]*))(?:(?!^dn:|^\s*(?:\r|\n|$)|\Z).)*', string, re.M|re.I|re.S):
    print("-------")
    print("matchObj.group(1) : ", matchObj.group(1))

Returns

-------
matchObj.group(1) :  20121221082545Z_1
-------
matchObj.group(1) :  20130621061525Z_2
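If the goal is only to keep whole entries from a given date range, a simpler alternative to the single large regex is to split the text on blank lines and test each entry separately. This is a different technique from the pattern above, sketched under some assumptions: entries are separated by blank lines, no attribute value is folded across lines or base64-encoded, and the target range is April to June 2013 as in the question. The sample data and the names `LDIF_TEXT`, `STAMP` and `entries_created_in_range` are made up for illustration.

```python
import re

# Hypothetical sample data modelled on the entries in the question.
LDIF_TEXT = """dn: uid=old,dc=example,dc=com
otherattr: a
createtimestamp: 20121221082545Z

dn: uid=new,dc=example,dc=com
otherattr: b
createtimestamp: 20130621061525Z
"""

# April-June 2013: "20130", then 4, 5 or 6, then DDHHMMSS and a trailing "Z".
STAMP = re.compile(r'^createtimestamp:\s*20130[4-6]\d{8}Z\s*$', re.M | re.I)


def entries_created_in_range(text):
    """Return the whole entries whose createtimestamp matches STAMP."""
    # Entries are assumed to be separated by one or more blank lines.
    entries = re.split(r'\n\s*\n', text.strip())
    return [e for e in entries if e.startswith('dn:') and STAMP.search(e)]


for entry in entries_created_in_range(LDIF_TEXT):
    print(entry)
    print('-------')
```

Because each entry is tested on its own, a timestamp in one entry can never pull a preceding entry into the match.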

Upvotes: 2

SteelPangolin

Reputation: 91

Don't try to parse LDIF by hand. It's not complicated, but things like attribute and name escaping, and line continuations for long lines, will bite you. Use the LDIF parser from python-ldap.
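A rough sketch of what that could look like, assuming python-ldap 3.x, where (as far as I know) the `ldif` module provides `LDIFRecordList` and attribute values come back as lists of bytes. The helper names `created_in_q2_2013` and `load_matching`, and the April to June 2013 range, are made up for this example; check the python-ldap documentation for your version before relying on it.

```python
# Prefixes for April, May and June 2013 (the range used in the question).
WANTED_PREFIXES = ('201304', '201305', '201306')


def created_in_q2_2013(entry):
    """entry maps attribute names to lists of values (bytes or str)."""
    for name, values in entry.items():
        # LDAP attribute names are case-insensitive.
        if name.lower() != 'createtimestamp':
            continue
        for value in values:
            if isinstance(value, bytes):
                value = value.decode('ascii', 'replace')
            if value.startswith(WANTED_PREFIXES):
                return True
    return False


def load_matching(path):
    """Parse an LDIF file and return the (dn, entry) pairs in range."""
    # Imported here so the helper above works without python-ldap installed.
    from ldif import LDIFRecordList  # part of the python-ldap package

    with open(path, 'rb') as ldif_file:
        parser = LDIFRecordList(ldif_file)
        parser.parse()
    return [(dn, entry) for dn, entry in parser.all_records
            if created_in_q2_2013(entry)]
```

The parser takes care of folded lines and encoded values, so the date check only ever sees complete attribute values.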

Upvotes: 2
