Ab Laxman

Reputation: 55

How do I get the exact position of regex matches?

For example, assume I have this text string:

What is the value of pn in 1 ;/
This is a test 12./ lop

I want to get the exact line and position of each match for the regex pattern \d\s?[.,;:]\s?/. How can I do that? I've tried:

static void Main()
{
    string text = @"What is the value of pn in 1 ;/
This is a test 12./ lop";
    string pattern = @"\d\s?[.,;:]\s?/";
    foreach (Match m in Regex.Matches(text, pattern))
    {
        var info = LineFromPos(text, m.Index);
        Console.WriteLine(info + "," + m.Index);
    }

    Console.Read();
}
public static int LineFromPos(string S, int Pos)
{
    int Res = 1;
    for (int i = 0; i <= Pos - 1; i++)
        if (S[i] == '\n') Res++;
    return Res;
}

But the code outputs:

1,27
2,49

when it should be:

1,27
2,16

How do I fix this?

Upvotes: 0

Views: 120

Answers (2)

Jon Skeet

Reputation: 1500675

You're currently treating m.Index as if it's the position in the line, but it's actually the position in the whole string. It sounds like you may want to write a method to convert from a string index into a position (both line and index within line), assuming you want to keep the matches within a single string.

For example (using ValueTuple and C# 7 syntax - you could create your own line/column type otherwise):

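static (int line, int column) FindPosition(string text, int index)
{
    int line = 0;
    int current = 0; // index of the first character of the current line
    while (true)
    {
        int next = text.IndexOf('\n', current);
        // Stop once the next line break lies beyond the index, or there are no more.
        if (next > index || next == -1)
        {
            return (line, index - current);
        }
        current = next + 1;
        line++;
    }
}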

We could be more efficient than that by remembering the position of the previous match, but it's simpler to keep it as just accepting the string and index.

Here's a complete example of that in your code:

using System;
using System.Text.RegularExpressions;

class Test
{
    static void Main()
    {
        string text = @"What is the value of pn in 1 ;/
This is a test 12./ lop";
        string pattern = @"\d\s?[.,;:]\s?/";
        foreach (Match m in Regex.Matches(text, pattern))
        {
            var position = FindPosition(text, m.Index);
            Console.WriteLine($"{position.line}, {position.column}");
        }
    }

    static (int line, int column) FindPosition(string text, int index)
    {
        int line = 0;
        int current = 0;
        while (true)
        {
            int next = text.IndexOf('\n', current);
            if (next > index || next == -1)
            {
                return (line, index - current);
            }
            current = next + 1;
            line++;
        }
    }
}

That prints:

0, 27
1, 16

That's using 0-based line and column numbers - obviously you can add 1 when you display the values if you want to.
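
If the text is large and has many matches, the "remember the position of the previous match" idea mentioned earlier could look something like the sketch below. PositionTracker is a hypothetical helper name, not part of .NET, and it assumes the indexes are passed in ascending order (which is how Regex.Matches yields them). The demo uses an explicit "\n" so the result doesn't depend on the source file's line endings.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (an illustration only): remembers where the previous
// lookup stopped, so each line break in the text is scanned at most once.
// Assumption: callers never ask for an index on an earlier line than before.
sealed class PositionTracker
{
    private readonly string text;
    private int line;       // 0-based number of the current line
    private int lineStart;  // index of the first character of that line

    public PositionTracker(string text)
    {
        this.text = text;
    }

    public (int line, int column) Find(int index)
    {
        if (index < lineStart)
        {
            throw new ArgumentOutOfRangeException(
                nameof(index), "Index is on an earlier line than the previous lookup.");
        }
        while (true)
        {
            int next = text.IndexOf('\n', lineStart);
            // The index is on the current line if no later line break precedes it.
            if (next == -1 || next >= index)
            {
                return (line, index - lineStart);
            }
            lineStart = next + 1;
            line++;
        }
    }
}

static class Demo
{
    static void Main()
    {
        string text = "What is the value of pn in 1 ;/\nThis is a test 12./ lop";
        var tracker = new PositionTracker(text);
        foreach (Match m in Regex.Matches(text, @"\d\s?[.,;:]\s?/"))
        {
            var (line, column) = tracker.Find(m.Index);
            Console.WriteLine($"{line}, {column}"); // prints "0, 27" then "1, 16"
        }
    }
}
```

The trade-off is that the tracker is stateful, so you need one instance per text, whereas FindPosition can be called with any index in any order.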

Upvotes: 0

Oleksii Klipilin

Reputation: 1936

You can try something like this:

// Requires: using System; using System.Text.RegularExpressions;
string text = @"What is the value of pn in 1 ;/
This is a test 12./ lop";
string pattern = @"\d\s?[.,;:]\s?/";

// Keep empty lines so that the line numbers stay aligned with the original text.
string[] lines = Regex.Split(text, "\r\n|\r|\n");
for (int i = 0; i < lines.Length; i++)
{
    foreach (Match m in Regex.Matches(lines[i], pattern))
    {
        // 1-based line number, 0-based index within the line
        Console.WriteLine(string.Format("{0},{1}", i + 1, m.Index));
    }
}

Upvotes: 1
