Yan Sklyarenko

Reputation: 32240

Match the lines that have a single slash before a special word

I have a number of lines in a similar format (actually, file paths). For example:

root/DATA/some/file.txt
root/DATA/another/file.txt
root/DATA/yet/another/file.exe
root/site/some/other/folder/before/DATA/file.xml
root/site/some/other/folder/DATA/file2.xml

I'd like to keep only those that contain a single slash before DATA; that is, the first 3 above should match, but the last 2 should not. NOTE: root stands for any sequence of characters excluding / and \.

I ended up with this regex, but it still matches all 5 samples:

[^/]*/data/.*

And I'm stuck here. How do I make it reject a line when DATA does not come directly after the first slash?

Upvotes: 0

Views: 87

Answers (4)

Bartosz Pachołek

Reputation: 1308

You have several options. For example, you could capture everything up to DATA and then check how many '/' characters come before it (in the first group, say), or you could match against a longer string. What you asked for specifically can be reproduced and reused with this code:

using System;
using System.Text.RegularExpressions;

// The five sample paths, one per line.
string type_1 = "" +
    "root/DATA/some/file.txt" + "\n" +
    "root/DATA/another/file.txt" + "\n" +
    "root/DATA/yet/another/file.exe" + "\n" +
    "root/site/some/other/folder/before/DATA/file.xml" + "\n" +
    "root/site/some/other/folder/DATA/file2.xml";

Console.WriteLine ("Start TEXT:");
Console.WriteLine (type_1);

Console.WriteLine ("Result TEXT:");
// Multiline makes ^ and $ anchor at each line, so every path is tested on its own.
MatchCollection mat = Regex.Matches (type_1, "^[^/]*/DATA.*?$", RegexOptions.Compiled|RegexOptions.Multiline);
Console.WriteLine (mat.Count);
foreach (Match m in mat) {
    Console.WriteLine (m.ToString ());
}

The output is:

Start TEXT:
root/DATA/some/file.txt
root/DATA/another/file.txt
root/DATA/yet/another/file.exe
root/site/some/other/folder/before/DATA/file.xml
root/site/some/other/folder/DATA/file2.xml
Result TEXT:
3
root/DATA/some/file.txt
root/DATA/another/file.txt
root/DATA/yet/another/file.exe

It works because ^[^/]* starts at the beginning of each line and cannot cross a '/', so the first '/' in the line has to be the one directly before 'DATA'.
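
The other option mentioned at the top of this answer, checking the slashes before DATA without a regex, could look roughly like the sketch below. The class and method names (SlashCounter, HasSingleSlashBeforeData) are made up for illustration, and the case-insensitive comparison is an assumption carried over from the question's lowercase "data" pattern.

```csharp
using System;

static class SlashCounter
{
    // Hypothetical helper: true when the first '/' in the path is the one
    // that directly precedes the DATA segment.
    public static bool HasSingleSlashBeforeData(string path)
    {
        // Locate the first "/DATA/" segment (case-insensitive by assumption).
        int dataIndex = path.IndexOf("/DATA/", StringComparison.OrdinalIgnoreCase);
        if (dataIndex < 0)
        {
            return false; // no DATA segment at all
        }

        // If the very first slash is the one in front of DATA,
        // no other slash can come before it.
        return path.IndexOf('/') == dataIndex;
    }
}
```

Called on each line separately, this should accept the first three sample paths and reject the last two.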

Upvotes: 0

F.P

Reputation: 17831

Regex regex = new Regex("^[^/]*/data/.*",
                        RegexOptions.IgnoreCase|RegexOptions.Multiline);

Upvotes: 0

Raman Zhylich

Reputation: 3697

You should mark the start of the line:

^[^/]*/data/.*

Also, make sure the Regex is in multiline mode and ignores case.
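
To see what the anchor changes, here is a minimal sketch that counts matches for a given pattern with both options switched on. AnchorDemo and CountMatches are hypothetical names used only for this illustration.

```csharp
using System.Text.RegularExpressions;

static class AnchorDemo
{
    // IgnoreCase lets "data" match "DATA"; Multiline makes ^ match at the
    // start of every line instead of only at the start of the whole string.
    const RegexOptions Options = RegexOptions.IgnoreCase | RegexOptions.Multiline;

    // Counts how many matches the pattern finds in a newline-separated block of paths.
    public static int CountMatches(string pattern, string text)
    {
        return Regex.Matches(text, pattern, Options).Count;
    }
}
```

With the five sample paths joined by "\n", the unanchored pattern [^/]*/data/.* should give 5, because a match is free to begin in the middle of a line (at "before/DATA/..." or "folder/DATA/..."). The anchored pattern ^[^/]*/data/.* should give 3.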

Upvotes: 0

dda

Reputation: 6203

This should fix your problem:

^[^/]*/DATA/.*$
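
If the paths are already available one per string, a per-line check avoids the need for multiline mode. The sketch below is one way to do it, not the only one: PathFilter and Filter are hypothetical names, the pattern is shortened to the part that decides the match, and the root part also excludes '\' as described in the question's note. It is case-sensitive like the pattern above; add RegexOptions.IgnoreCase if lowercase "data" should match too.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class PathFilter
{
    // Root: any run of characters except '/' and '\', then "/DATA/".
    // Without RegexOptions.Multiline, ^ anchors at the start of each string.
    static readonly Regex SingleSlashBeforeData = new Regex(@"^[^/\\]*/DATA/");

    // Returns only the paths whose first slash is directly followed by DATA.
    public static List<string> Filter(IEnumerable<string> lines)
    {
        var kept = new List<string>();
        foreach (string line in lines)
        {
            if (SingleSlashBeforeData.IsMatch(line))
            {
                kept.Add(line);
            }
        }
        return kept;
    }
}
```

Testing each line on its own also sidesteps a subtlety of the multiline form: [^/] can match a newline, so on input containing lines with no slash at all, a match could span more than one line.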

Upvotes: 3
