Reputation: 22847
I want to parse out all attributes from an LDAP distinguished name. Each attribute starts at a comma or the beginning of the line, and ends at a comma or the end of the line.
I've written the following:
String patternStr = "[^,][A-Z]+=([A-Za-z0-9]+)[,$]";
String str = "CN=USERID003,OU=Users,DC=intern,DC=mycompany,DC=pl";
Pattern pattern = Pattern.compile(patternStr);
Matcher m = pattern.matcher(str);
while (m.find()) {
String substr = str.substring(m.start(), m.end());
System.out.println(substr);
System.out.println(m.group(1));
}
The output is:
CN=USERID003,
USERID003
OU=Users,
Users
DC=intern,
intern
DC=mycompany,
mycompany
Matching the start with [^,]
works correctly, but the block [,$]
matches only commas, not the end of the line.
How can I match either a comma or the end of the line as the end of the substring?
Upvotes: 0
Views: 4715
Reputation: 271
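You could change your pattern "[^,][A-Z]+=([A-Za-z0-9]+)[,$]"
to "(?:^|,)[A-Z]+=([A-Za-z0-9]+)(?=,|$)"
Then you'll get your desired results. The trailing (?=,|$) is a lookahead, so the comma is not consumed and can still open the next match in a find() loop. With a consuming (?:,|$) at the end, every second attribute would be skipped.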
I think the problem with your original pattern is this:
Inside a character class [...], only characters are allowed, not boundary matchers.
So [^,] means any character except ',', while [,$] means the character ',' or the literal character '$'. Neither has any boundary-matcher meaning.
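To make the difference concrete, here is a minimal sketch (the class and method names are hypothetical, chosen only for illustration) that checks how '$' behaves inside and outside a character class:

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class CharClassDemo {
    // Returns true if the whole input matches the given regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Inside [...], '$' is a literal dollar sign.
        System.out.println(fullMatch("a[,$]", "a$"));    // true
        System.out.println(fullMatch("a[,$]", "a,"));    // true
        // The class must consume one character, so end of input does not satisfy it.
        System.out.println(fullMatch("a[,$]", "a"));     // false
        // Outside a class, '$' is the end-of-input boundary matcher.
        System.out.println(fullMatch("a(?:,|$)", "a"));  // true
    }
}
```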
Upvotes: 0
Reputation: 784998
You can use this lookbehind-based regex for matching:
(?<=,|^)([^=]+)=([^,]*)
Code:
String patternStr = "(?<=,|^)([^=]+)=([^,]*)";
String str = "CN=USERID003,OU=Users,DC=intern,DC=mycompany,DC=pl";
Pattern pattern = Pattern.compile(patternStr);
Matcher m = pattern.matcher(str);
while (m.find()) {
System.out.printf("%s : %s%n", m.group(1), m.group(2));
}
Output:
CN : USERID003
OU : Users
DC : intern
DC : mycompany
DC : pl
Upvotes: 1
Reputation: 833
I would advise you to forget about the pattern and matcher and use String.split()
instead - it gives all the functionality that you want and the code is more readable.
String str = "CN=USERID003,OU=Users,DC=intern,DC=mycompany,DC=pl";
// Split the DN into "NAME=value" parts.
String[] attrs = str.split(",");
for (String attr : attrs) {
    System.out.println(attr);
    // The part after '=' is the attribute value.
    System.out.println(attr.split("=")[1]);
}
Hope this helps!
Upvotes: 2
Reputation: 111
Why not use str.split()? Split on "," first, then loop over the resulting "XX=YYYY" parts, and split each one again on "=" if you only need the attribute name or its value.
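A minimal sketch of that idea follows. The class and method names are hypothetical, and it assumes the values contain no escaped commas or equals signs, which real LDAP distinguished names are allowed to have:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; assumes no escaped ',' or '=' inside attribute values.
class DnSplitSketch {
    // Returns one {name, value} pair per attribute, in order of appearance.
    static List<String[]> parse(String dn) {
        List<String[]> pairs = new ArrayList<>();
        for (String attr : dn.split(",")) {
            // Limit 2 keeps any further '=' characters inside the value.
            String[] kv = attr.split("=", 2);
            if (kv.length == 2) {
                pairs.add(kv);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        String dn = "CN=USERID003,OU=Users,DC=intern,DC=mycompany,DC=pl";
        for (String[] kv : parse(dn)) {
            System.out.println(kv[0] + " : " + kv[1]);
        }
    }
}
```

Parts without an "=" are silently skipped here. Whether that is the right behaviour depends on how strict the parsing needs to be.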
Upvotes: 1
Reputation: 31290
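This should do what you want according to your description:
String patternStr = "(?:^|,)[A-Z]+=([A-Za-z0-9]+)(?=,|$)";
A match starts at the beginning of the line/string or at a comma, and ends just before a comma or at the end of the line/string. The trailing group is a lookahead, so the comma is left in place to start the next match.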
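When the pattern is used in a find() loop, it matters whether the trailing comma is consumed, because the next attribute needs that same comma as its start. The following sketch (class and method names are hypothetical) compares a consuming group (?:,|$) with a lookahead (?=,|$) on the DN from the question:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class DnRegexSketch {
    // Collects capture group 1 of every match found in the input.
    static List<String> values(String regex, String dn) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(dn);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        String dn = "CN=USERID003,OU=Users,DC=intern,DC=mycompany,DC=pl";
        // Consuming the trailing comma skips every second attribute.
        System.out.println(values("(?:^|,)[A-Z]+=([A-Za-z0-9]+)(?:,|$)", dn));
        // prints [USERID003, intern, pl]
        // A lookahead leaves the comma available for the next match.
        System.out.println(values("(?:^|,)[A-Z]+=([A-Za-z0-9]+)(?=,|$)", dn));
        // prints [USERID003, Users, intern, mycompany, pl]
    }
}
```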
Upvotes: 3