esims

Reputation: 420

Regular expression for text with multiple character delimiter

I have a text file like this:

::content1 ...
...
::content2 ...
...
::content3 ...
...

So the text (spanning multiple lines) is delimited by "::", and I want to extract the blocks in between. Perhaps the simplest way is to replace "::" with a single character and then use the Split method in VB.Net. But I wonder whether this can also be done with a regular expression, like this:

Dim myRegex As New Regex("...")
Dim m As Match = myRegex.Match(content)
Do While m.Success
    ...
    m = m.NextMatch()
Loop

It seemed easy but I can't find the right regular expression pattern.

Edit: Someone asked what I had already tried. I tried a negative lookahead, but it does not work. (I use [\S\s] instead of ".", because as I understand it the period does not match a newline character. Maybe my understanding of regular expressions is poor.)

    Dim myRegex As New Regex("::(?![\S\s]+::)[\S\s]+")

Upvotes: 2

Views: 2258

Answers (2)

Wiktor Stribiżew

Reputation: 627607

Note that your ::(?![\S\s]+::)[\S\s]+ pattern matches a :: only when it is not followed by one or more characters ([\S\s]+) and then another ::, and then matches one or more characters. Because of that lookahead, the pattern can only find the last :: and whatever comes after it.
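If you do want to keep the Match/NextMatch loop from the question, one pattern that should work is ::([\s\S]*?)(?=::|\z): a lazy match that stops right before the next :: or at the end of the string. This is only a minimal sketch; the module name and the sample text are made up for illustration, and it assumes :: never appears inside a block.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module MatchLoopDemo
    Sub Main()
        ' Hypothetical sample input shaped like the file in the question.
        Dim content As String = "::content1 a" & vbLf & "b" & vbLf &
                                "::content2 c" & vbLf & "::content3 d"

        ' "::" followed by as few characters as possible (group 1),
        ' up to the next "::" or the very end of the string (\z).
        Dim myRegex As New Regex("::([\s\S]*?)(?=::|\z)")

        Dim m As Match = myRegex.Match(content)
        Do While m.Success
            ' Group 1 holds the block text without the leading "::".
            Console.WriteLine(m.Groups(1).Value.Trim())
            m = m.NextMatch()
        Loop
    End Sub
End Module
```

The lookahead does not consume the next ::, so NextMatch() can start the following block exactly there.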

Note that in this case you do not really need a regex if :: always delimits the contents. Here is a way to do it:

Dim result As String() = str.Split(New String() {"::"}, StringSplitOptions.RemoveEmptyEntries)

If you want to split only on a :: that appears at the beginning of a line, you might want to use a regex like

Dim result As List(Of String) = Regex.Split(str, "^::", RegexOptions.Multiline).Where(
        Function(m) String.Equals(m.Trim(), String.Empty) = False).ToList()

where RegexOptions.Multiline will make ^ match the start of a line, not the start of a string.
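To see how the two approaches differ, here is a small sketch that runs both on a string where :: also occurs in the middle of a line. The sample text and module name are assumptions for illustration only.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module SplitDemo
    Sub Main()
        ' Hypothetical sample: the first block contains "::" in the middle of a line.
        Dim str As String = "::content1 a::b" & vbLf & "more" & vbLf & "::content2 c"

        ' Plain string split: breaks on every "::", including the mid-line one.
        Dim plain As String() = str.Split(New String() {"::"}, StringSplitOptions.RemoveEmptyEntries)

        ' Line-anchored split: breaks only on a "::" at the start of a line.
        Dim anchored As List(Of String) = Regex.Split(str, "^::", RegexOptions.Multiline).Where(
            Function(s) Not String.IsNullOrWhiteSpace(s)).ToList()

        Console.WriteLine(plain.Length)    ' 3 pieces
        Console.WriteLine(anchored.Count)  ' 2 pieces
    End Sub
End Module
```

With this input the plain split cuts the first block in two, while the anchored regex keeps it whole, so which one fits depends on whether :: can legitimately appear inside the content.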

Upvotes: 1

Rjun

Reputation: 406

There is no need for a regular expression. Just split the text on ::.

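' Pass the delimiter as a String array: on .NET Framework, Split("::") can bind to the Char overload and split on a single ":".
content.Split(New String() {"::"}, StringSplitOptions.None)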

Upvotes: 0
