Nenad

Reputation: 3556

Obtain the line number for matched pattern

I use this code to check whether a string exists in a text file that I have loaded into memory:

foreach (Match m in Regex.Matches(haystack, needle))
    richTextBox1.Text += "\nFound @ " + m.Index;

The regex returns the character position of each match, but how do I get the line number?

Upvotes: 14

Views: 13038

Answers (4)

theDoke

Reputation: 565

To do this I did the following...

  • Read file contents into buffer
  • Use a regex to match every line break (\n) in the file and store their indexes in a list of carriage returns

    // Requires: using System.Collections.Generic; using System.Linq;
    //           using System.Text.RegularExpressions;

    // Holds the character offset of one line break.
    private class CarriageReturn
    {
        public int Index { get; set; }
    }

    private static List<CarriageReturn> _GetCarriageReturns( string data )
    {
        // Each match is a single '\n'; Matches returns them in ascending order.
        var carriageReturnRegex = new Regex( @"\n" );

        return carriageReturnRegex.Matches( data )
            .Cast<Match>()
            .Select( match => new CarriageReturn { Index = match.Index } )
            .ToList();
    }
    
  • Run my search regex on the file and, for every match, do something like this: LineNumber = carriageReturns.Count( ret => ret.Index < match.Groups[1].Index ) + 1

So basically I count the carriage returns that occur before my match and add 1.
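
A minimal sketch of the same idea, with one change that is not part of the answer above: because the newline offsets are collected in ascending order, the count of newlines before a match can be found with a binary search instead of a linear Count. The names LineIndex, NewlineOffsets, LineOf and FindLines are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LineIndex
{
    // Collect the offset of every '\n' in the text, in ascending order.
    public static List<int> NewlineOffsets(string text)
    {
        var offsets = new List<int>();
        for (int i = text.IndexOf('\n'); i >= 0; i = text.IndexOf('\n', i + 1))
            offsets.Add(i);
        return offsets;
    }

    // 1-based line number of a character offset: the number of newlines
    // strictly before the offset, plus one.
    public static int LineOf(List<int> newlineOffsets, int offset)
    {
        int pos = newlineOffsets.BinarySearch(offset);
        // If the offset is not in the list, BinarySearch returns the bitwise
        // complement of the index of the first larger element, which is also
        // the number of smaller elements.
        int newlinesBefore = pos >= 0 ? pos : ~pos;
        return newlinesBefore + 1;
    }

    // Line number of every match of needle in haystack.
    public static List<int> FindLines(string haystack, string needle)
    {
        var offsets = NewlineOffsets(haystack);   // built once per file
        var lines = new List<int>();
        foreach (Match m in Regex.Matches(haystack, needle))
            lines.Add(LineOf(offsets, m.Index));
        return lines;
    }
}
```

Building the offset list costs one pass over the file; each lookup afterwards is logarithmic in the number of lines, which should pay off when a file has many matches.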

Upvotes: 1

Nenad

Reputation: 3556

The best solution is to call a method that computes the line number only when a match occurs. That way performance barely suffers when many files are checked, and a regex containing \n still works. I found this method somewhere on Stack Overflow:

    public int LineFromPos(string input, int indexPosition)
    {
        int lineNumber = 1;
        for (int i = 0; i < indexPosition; i++)
        {
            if (input[i] == '\n') lineNumber++;
        }
        return lineNumber;
    }
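
One possible refinement, sketched here as an assumption rather than as part of the method above: LineFromPos rescans the text from the start for every match. Regex.Matches returns matches in ascending index order (as long as RegexOptions.RightToLeft is not used), so a running counter can pick up where the previous match left off. The names MatchLines and Find are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchLines
{
    // Returns (match index, 1-based line number) pairs for every match.
    // Assumes matches arrive in ascending index order (no RightToLeft option).
    public static List<KeyValuePair<int, int>> Find(string haystack, string needle)
    {
        var results = new List<KeyValuePair<int, int>>();
        int line = 1;      // line number at offset 'scanned'
        int scanned = 0;   // newlines before this offset are already counted

        foreach (Match m in Regex.Matches(haystack, needle))
        {
            // Count only the newlines between the previous match and this one.
            for (; scanned < m.Index; scanned++)
                if (haystack[scanned] == '\n') line++;

            results.Add(new KeyValuePair<int, int>(m.Index, line));
        }
        return results;
    }
}
```

Each character of the text is visited at most once, no matter how many matches there are, and nothing is scanned past the last match.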

Upvotes: 13

MrFox

Reputation: 5106

    foreach (Match m in Regex.Matches(haystack, needle))
    {
        // Count the newlines that precede the match to get its starting line.
        int startLine = 1;
        for (int i = 0; i < m.Index; i++)
            if (haystack[i] == '\n')
                startLine++;

        // Count the newlines inside the match itself. m.Length is the length
        // of the matched text, which can differ from needle.Length.
        int endLine = startLine;
        for (int i = m.Index; i < m.Index + m.Length; i++)
            if (haystack[i] == '\n')
                endLine++;

        richTextBox1.Text += string.Format(
            "\nFound @ {0} Line {1} to {2}", m.Index, startLine, endLine);
    }

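A match only crosses a line break if the pattern itself matches newline characters (for example via \s or RegexOptions.Singleline); a plain needle that is split across two lines is not found by the regex.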

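Edit: maybe you can replace the line breaks in the text with spaces and apply the regex to the result, so a needle that falls across a line break is still found. Be aware that the match indexes only line up with the original text when each line break is a single character; Environment.NewLine is "\r\n" on Windows, so there the replacement below shifts every index after the first line break: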

Regex.Matches(haystack.Replace(Environment.NewLine, " "), needle)
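
A hedged sketch of one way to keep the indexes aligned: replace each '\r' and '\n' character with a single space, so the flattened text has the same length as the original and a match index can be looked up directly in the original. FlattenedSearch, Flatten and FindLineSpans are names invented for this example, and it assumes the needle is written to match spaces where the line breaks were.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FlattenedSearch
{
    // Character-for-character replacement: the result has the same length
    // as the input, so offsets in both strings refer to the same position.
    public static string Flatten(string text)
    {
        return text.Replace('\r', ' ').Replace('\n', ' ');
    }

    // 1-based line number of an offset, counting '\n' before it.
    public static int LineFromPos(string input, int indexPosition)
    {
        int lineNumber = 1;
        for (int i = 0; i < indexPosition; i++)
            if (input[i] == '\n') lineNumber++;
        return lineNumber;
    }

    // For every match in the flattened text, returns { startLine, endLine }
    // measured against the ORIGINAL text.
    public static List<int[]> FindLineSpans(string haystack, string needle)
    {
        var spans = new List<int[]>();
        string flat = Flatten(haystack);
        foreach (Match m in Regex.Matches(flat, needle))
        {
            // Offset of the last matched character (or the match position
            // itself for an empty match).
            int last = m.Length > 0 ? m.Index + m.Length - 1 : m.Index;
            spans.Add(new[] { LineFromPos(haystack, m.Index),
                              LineFromPos(haystack, last) });
        }
        return spans;
    }
}
```

Because "\r\n" becomes two spaces, a needle such as `hello\s+world` is a safer choice than one with a single literal space.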

Upvotes: 0

BrokenGlass

Reputation: 160862

You can split your text into lines first and apply your regex to each line. Of course, that doesn't work if the needle contains a newline:

var lines = haystack.Split(new[] { Environment.NewLine }, StringSplitOptions.None);
for (int i = 0; i < lines.Length; i++)
{
    // i is zero-based, so report i + 1 as the line number.
    foreach (Match m in Regex.Matches(lines[i], needle))
        richTextBox1.Text += string.Format("\nFound @ line {0}", i + 1);
}
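
A variation on the same per-line idea, offered as a sketch: splitting on Environment.NewLine misses line breaks when the file uses a different convention than the current platform (for example "\n" files on Windows). StringReader.ReadLine accepts "\n", "\r" and "\r\n" alike. LineByLineSearch and MatchingLines are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class LineByLineSearch
{
    // Returns the 1-based line number of every match, one entry per match,
    // so a line with two matches appears twice.
    public static List<int> MatchingLines(string haystack, string needle)
    {
        var regex = new Regex(needle);   // build the pattern once, reuse per line
        var lineNumbers = new List<int>();

        using (var reader = new StringReader(haystack))
        {
            string line;
            int lineNumber = 0;
            while ((line = reader.ReadLine()) != null)
            {
                lineNumber++;
                foreach (Match m in regex.Matches(line))
                    lineNumbers.Add(lineNumber);
            }
        }
        return lineNumbers;
    }
}
```

As with the split version, a needle that spans a line break is never found, because each line is searched on its own.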

Upvotes: 6
