Marco

Reputation: 23937

Regex with lookaround to find group of numbers is inconclusive

I have a hierarchical folder structure that is built on a nested asset/project hierarchy.

I need to traverse this folder structure and move certain hierarchy levels to other destinations.

To identify which hierarchy level I am currently on and where the current folder needs to be moved, I want to use regular expressions.

I have got my expressions ready for the levels 1 through 4 and the project level, but Level 5 is inconclusive and I can't figure out why.

Given the following 2 examples (Demo):

  1. Level 4 (22^108^581^2116)
  2. Foo, Kings Road Level 5 (22^108^581^2116^7310)

The regex \((?<!\^)(\d{2,}\^{0,1}){4}(?!\^)\) matches only the level 4 asset, which is correct. The regex for level 5 is similar: \((?<!\^)(\d{2,}\^{0,1}){5}(?!\^)\) - I only increased the repetition count of the group from 4 to 5, but in the demo it matches level 4 as well as level 5, which should not happen. So the goal is to match the following pattern:

  1. Opening parenthesis
  2. 5 groups of 2 to n digits
  3. The groups are separated by a caret
  4. The first group must not have a leading caret
  5. The last group must not have a trailing caret
  6. Closing parenthesis

What did I do wrong?

PS: In case it matters, the folders reside in a SharePoint document library and the code will run in PowerShell.

Upvotes: 1

Views: 63

Answers (1)

Matt

Reputation: 46710

You don't need to handle criteria 4 and 5 the way you are. As long as the opening parenthesis is directly followed by digits and the closing parenthesis is directly preceded by digits, you should be fine.

\((\d{2,}\^){4}\d{2,}\)

This matches the outer parentheses, 4 groups of digits that are each followed by a caret, and one final group of digits. If you are looking to match level 3, change the 4 to 2 in the regex above.

If there were a leading or trailing caret, the string would not match.
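
A likely reason the original level 5 pattern also matched the level 4 folder: the caret in \^{0,1} is optional, so the engine can backtrack and split 2116 into 21 and 16, which gives it five groups. The sketch below checks this against the two sample names from the question; the variable names are only for illustration.

```powershell
# Sample folder names taken from the question.
$level4 = 'Level 4 (22^108^581^2116)'
$level5 = 'Foo, Kings Road Level 5 (22^108^581^2116^7310)'

# Original level 5 pattern: the caret after each group is optional.
$original = '\((?<!\^)(\d{2,}\^{0,1}){5}(?!\^)\)'
# Pattern from this answer: the caret between groups is mandatory.
$fixed = '\((\d{2,}\^){4}\d{2,}\)'

$originalOnLevel4 = $level4 -match $original   # True: 2116 is split into 21 and 16
$fixedOnLevel4    = $level4 -match $fixed      # False: only three carets are present
$fixedOnLevel5    = $level5 -match $fixed      # True
```

Making the caret mandatory between groups removes the ambiguity, so the lookarounds are no longer needed.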

Named Matches

Depending on how you are using these values later, it might be beneficial to look at named matches in PowerShell. What we are going to do is build a custom regex match string based on the number of levels that you are trying to match against.

$matchNumberOfLevels = 5
$regex = "\(" + 
    ((1..($matchNumberOfLevels-1) | ForEach-Object{"(?<level$_>\d{2,})\^"}) -join "") +
    "(?<level$matchNumberOfLevels>\d{2,})\)"

"Foo, Kings Road Level 5 (22^108^581^2116^7310)" -match $regex

For each of those levels (1 to 5 in the example above) we make a named match called level<n>, where n is the position of the caret-delimited number. When you then look at $matches, you get named matches that you can use later in your code.

$matches

Name                           Value                                                                                                 
----                           -----                                                                                                 
level3                         581                                                                                                   
level2                         108                                                                                                   
level4                         2116                                                                                                  
level5                         7310                                                                                                  
level1                         22                                                                                                    
0                              (22^108^581^2116^7310) 

$matches.level1
22

Cool, but it might be overkill.

Split result

A simple trim and split would get something similar, just without the fancy names.

$matchNumberOfLevels = 5
"Foo, Kings Road Level 5 (22^108^581^2116^7310)" -match "\((\d{2,}\^){$($matchNumberOfLevels - 1)}\d{2,}\)"
$levels = $Matches[0].Trim("()") -split "\^"
$levels[0]

So $levels is an array with 5 elements corresponding to the levels of your system.
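
If the goal is only to find out which level a folder is on, one possible variation (not part of the approach above) is to match any number of caret-delimited groups and count them. This is a sketch under assumptions: Get-HierarchyLevel is a hypothetical helper name, and it assumes the ID chain is the only parenthesised run of digits in the folder name.

```powershell
# Hypothetical helper: returns the number of caret-delimited ID groups, or 0.
function Get-HierarchyLevel {
    param([string]$FolderName)

    # One group of 2+ digits, followed by any number of "^digits" groups.
    if ($FolderName -match '\((\d{2,}(?:\^\d{2,})*)\)') {
        # Count the caret-delimited parts; @() keeps a single part countable.
        return @($Matches[1] -split '\^').Count
    }
    return 0   # no parenthesised ID chain found
}

Get-HierarchyLevel 'Level 4 (22^108^581^2116)'
Get-HierarchyLevel 'Foo, Kings Road Level 5 (22^108^581^2116^7310)'
```

With the sample names from the question, the two calls return 4 and 5.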

Note that this logic fails when you try to match level 1 only.
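
At least for the named-match builder, the level 1 failure comes from the range: in PowerShell, 1..0 counts down and yields 1 and 0 instead of an empty list, so the generated pattern asks for extra groups. A minimal sketch of one way around this, assuming a hypothetical helper called New-LevelRegex that joins the groups with a literal caret:

```powershell
# Hypothetical helper: builds a named-group pattern for exactly $Levels groups.
function New-LevelRegex {
    param([int]$Levels)

    # Build one named group per level: level1, level2, ...
    $groups = for ($i = 1; $i -le $Levels; $i++) { "(?<level$i>\d{2,})" }
    # Join with a literal caret, so a single level needs no special case.
    '\(' + ($groups -join '\^') + '\)'
}

$level1Regex = New-LevelRegex -Levels 1
$level5Regex = New-LevelRegex -Levels 5

'Foo, Kings Road Level 5 (22^108^581^2116^7310)' -match $level5Regex
```

Because the carets are added by the join rather than appended to each group, the same code covers level 1 and every deeper level.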

Upvotes: 2
