Reputation: 648
I want to do something that seems simple, but I may end up writing a program to do this search.
I work on many projects in VB.NET. The folder structure is, I assume, fairly common. For instance, if my project is called 'SalesLibrary', then it will have the following folder structure:
<Projects Folder>\SalesLibrary\SalesLibrary
In this folder, I have the bin, obj, Resources, and My Project folders. Under the bin folder I have the Release and Debug folders. During development my output goes to the Debug folder, but my final build goes in the Release folder.
My problem is that sometimes I get behind in updating my repository where I keep the output of about 120 application extensions.
Anyway, the Release folder will normally contain the main DLL, but also the DLLs of any references I make within the project.
So in the Release folder of the sales library, I will have SalesLibrary.DLL, but also SalesTaxLibrary.dll. The paths for the two would be:
<Project Folder>\SalesLibrary\SalesLibrary\bin\Release\SalesLibrary.dll
<Project Folder>\SalesLibrary\SalesLibrary\bin\Release\SalesTaxLibrary.dll
What I'd like to figure out is how to build a regex that selects these files when the filename (without its extension) matches one of the components in its path (e.g. \SalesLibrary\SalesLibrary.dll) and, if so, returns true. In the case of the first file above, SalesLibrary occurs multiple times in its path, so it would be selected.
My regex skills are below par and I'm really having a hard time with this one. Alternatively, I've always wanted a reason to learn some Java.
Upvotes: 0
Views: 83
Reputation: 338148
Here is the naive approach to do this:
Bummer, regular expressions work from left to right, but the filename is at the end of the string. So we must find a way to match the filename first and check the remainder of the string after that.
Method one
Reverse the string, use the above approach.
reverse the string
@"lld.yrarbiLselaS\yrarbiLselaS\foo\:C"
capture filename without extension (reversed, i.e. extension first): ^[^.]+\.([^\\]+), that's
^ # start-of-string
[^.]+ # extension
\. # a dot
([^\\]+) # filename (anything but backslashes), capture to group 1
check following sections of the path for group 1: (?:\\(?!\1)[^\\]+)*, that's:
(?: # non-capturing group, matches one path component
\\ # a backslash
(?!\1) # look-ahead: anything but group 1 is allowed
[^\\]+ # match it
)* # end group, repeat
$ # end-of-string
In one part: ^[^.]+\.([^\\]+)(?:\\(?!\1)[^\\]+)*$
If this matches, the file name is not repeated anywhere else in the path. (Don't forget to reverse the path first.)
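As a rough sketch of Method one, the reversal and the pattern above can be combined as follows. Python is used purely for illustration, since this pattern needs no engine-specific features; the function name and the sample paths are made up for the example. It assumes the file name has an extension (without a dot the pattern cannot match at all) and it compares case-sensitively.

```python
import re

# Method one's pattern, meant to be applied to the REVERSED path.
# A match means the file name is NOT repeated in any folder of the path.
NOT_REPEATED = re.compile(r"^[^.]+\.([^\\]+)(?:\\(?!\1)[^\\]+)*$")


def name_repeats_in_path(path: str) -> bool:
    """Return True if the file name (without extension) also begins a folder name.

    Hypothetical helper for illustration; assumes a Windows-style path whose
    file name has an extension.
    """
    reversed_path = path[::-1]  # e.g. "lld.yrarbiLselaS\yrarbiLselaS\oof\:C"
    # No match on the reversed string means some folder starts with group 1.
    return NOT_REPEATED.match(reversed_path) is None


if __name__ == "__main__":
    print(name_repeats_in_path(r"C:\foo\SalesLibrary\SalesLibrary.dll"))
    print(name_repeats_in_path(r"C:\foo\SalesLibrary\SalesTaxLibrary.dll"))
```

Note that the result is inverted relative to the regex: the function reports a repeat exactly when the pattern fails to match.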
Method two
Use the variable-length look-behind feature of the .NET regex engine to roll up the string from the end.
keep the string
@"C:\foo\SalesLibrary\SalesLibrary.dll"
get the filename: ([^\\]+)\.[^.]+$, that's
([^\\]+) # filename (anything but backslashes), capture to group 1
\. # a dot
[^.]+ # extension
$ # end-of-string
after group 1, insert a look-behind that looks until the start of the string
(?<= # positive look-behind, i.e. it must be true that:
^ # start-of-string
(?: # non-capturing group, matches a path component
(?!\1) # look-ahead: anything but group 1 is allowed
[^\\]* # match it
\\ # a backslash
)* # end group, repeat
\1 # match the filename again (see note below)
) # end look-behind
In one part: ([^\\]+)(?<=^(?:(?!\1)[^\\]*\\)*\1)\.[^.]+$
Note: since we inserted the look-behind between the filename and the extension, we need to match the filename itself inside the look-behind, because it is part of what we are looking behind at.
Like before, if this matches, the file name is not repeated anywhere else in the path.
Method three
Just grab the filename, split the path on the backslashes, and loop through the array looking for matches.
Frankly, that's what I would do. The regex examples above are purely academic.
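A minimal sketch of Method three, again in Python for illustration only. The function name and sample paths are hypothetical, and the case-insensitive comparison is my assumption (Windows file names are normally treated that way, and the question mixes .DLL and .dll).

```python
def name_matches_path_component(path: str) -> bool:
    """Return True if the file name (without extension) equals a folder name.

    Hypothetical helper for illustration; assumes backslash-separated paths.
    """
    # Split on backslashes: everything but the last piece is a folder.
    *folders, file_name = path.split("\\")

    # Drop the extension (the part after the last dot), if there is one.
    stem, dot, _extension = file_name.rpartition(".")
    if not dot:
        stem = file_name  # no dot at all, so the whole name is the stem

    # Assumption: compare case-insensitively, as Windows paths usually are.
    stem = stem.lower()
    return any(folder.lower() == stem for folder in folders)


if __name__ == "__main__":
    base = "C:\\Projects\\SalesLibrary\\SalesLibrary\\bin\\Release\\"
    print(name_matches_path_component(base + "SalesLibrary.dll"))
    print(name_matches_path_component(base + "SalesTaxLibrary.dll"))
```

Unlike the look-ahead in the regex versions, this compares whole folder names, so a folder such as SalesLibraryOld would not count as a match. In VB.NET the same steps should map onto Path.GetFileNameWithoutExtension and String.Split.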
Upvotes: 2