user2561842

Reputation: 21

Regex for Finding a Suffix, but do not capture from Right to Left

Newbie here. I'm trying to pull a value from the left only when a 5-digit number is found to its right, without capturing the number itself. Any direction would be appreciated.

Example:

Hello Industries                         12345

I need to find the 5-digit number, then grab the company name.

Upvotes: 2

Views: 114

Answers (3)

Louis Ricci

Reputation: 21086

Use named capturing groups.

using System;
using System.Text.RegularExpressions;
public class Test
{
  public static void Main()
  {
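    string test = "Hello Industries 12345 Another One 54321";
    // The lookbehind anchors each match at the start of the input or just after a previous number;
    // NAME lazily captures the text that precedes the next 5-digit NUMBER.
    var matches = Regex.Matches(test, @"(?<=(\d{5}\s+|^))(?<NAME>.*?)\s+(?<NUMBER>\d{5})");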
    foreach(Match m in matches)
    {
      Console.WriteLine(string.Format("Name: {0} #: {1}", 
        m.Groups["NAME"].Value, 
        m.Groups["NUMBER"].Value));
    }
  }
}

Upvotes: 1

Rory O'Kane

Reputation: 30398

Use a lookahead, (?=...), to match on something without capturing it.

.+(?=\s+\d{5})

You can see that this regex works using this online tool.
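
For anyone trying this in C#, here is a minimal sketch of the lookahead approach. The class and the NameBeforeNumber helper are names made up for illustration. It uses a lazy .+? rather than the greedy .+ above, because the greedy form also matches all but one of the padding spaces before the number; if you keep the greedy pattern, trimming the result gets you to the same place.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Hypothetical helper: returns the text that precedes a 5-digit number,
    // or an empty string when no such number follows.
    public static string NameBeforeNumber(string line)
    {
        // .+? is lazy, so it stops at the first point where whitespace plus five digits follow.
        // The lookahead (?=...) checks for the number without adding it to the match.
        Match m = Regex.Match(line, @".+?(?=\s+\d{5})");
        return m.Success ? m.Value : string.Empty;
    }

    public static void Main()
    {
        string line = "Hello Industries                         12345";

        // Lazy pattern: prints [Hello Industries]
        Console.WriteLine("[" + NameBeforeNumber(line) + "]");

        // Greedy pattern from the answer, trimmed: also prints [Hello Industries]
        Console.WriteLine("[" + Regex.Match(line, @".+(?=\s+\d{5})").Value.TrimEnd() + "]");
    }
}
```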

Upvotes: 4

JDB

Reputation: 25835

Use a zero-width positive lookahead assertion to find content that appears before an expression. The lookahead expression itself will not be captured (thus the name "zero-width").

(\w+)(?=\s+\d{5})

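This will find a single word (\w matches [a-zA-Z0-9_] plus other Unicode word characters) that appears before a 5-digit number. For the example in the question, it matches only "Industries", not the full company name.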

I would guess based on your formatting that you have a newline-separated list of customers, each followed by its customer ID. If that's the case, you can use the following pattern in combination with the Multiline option to find a particular customer:

^.+(?=\s+12345)
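
A rough C# sketch of that lookup, assuming the list is held in a single string. FindCustomer and the sample data are invented for illustration. Two small departures from the pattern above, both optional: .+? is lazy so the padding before the ID stays out of the match, and \b after the ID keeps a longer number that merely starts with the same digits from matching.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CustomerLookup
{
    // Hypothetical helper: returns the name on the line that ends with the given ID,
    // or an empty string when the ID is not found.
    public static string FindCustomer(string list, string id)
    {
        // Regex.Escape guards against IDs that contain regex metacharacters.
        string pattern = @"^.+?(?=\s+" + Regex.Escape(id) + @"\b)";

        // Multiline makes ^ match at the start of every line, not just the start of the input.
        return Regex.Match(list, pattern, RegexOptions.Multiline).Value;
    }

    public static void Main()
    {
        // Sample data made up for this example.
        string list = "Hello Industries      12345\nAnother One      54321";

        Console.WriteLine(FindCustomer(list, "54321")); // prints: Another One
    }
}
```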

If you are trying to extract customer names from a document where the customer name is followed by the 5-digit customer ID, then you could use the following (assuming the customer name is capitalized):

([\p{Lu}\p{Lt}\p{Lo}]\w*\s+)+(?=\d{5})

This will find one or more words that each begin with an uppercase, titlecase, or "other" letter (\p{Lo}, a letter that has no case); words beginning with a lowercase letter are excluded.
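
As a sketch of how this might be used on running text in C# (ExtractNames and the sample sentence are made up for illustration): the repeated group consumes the whitespace before the ID, so each match is trimmed before it is stored.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NameExtractor
{
    // Hypothetical helper: collects every run of capitalized words that is
    // directly followed by a 5-digit number.
    public static List<string> ExtractNames(string text)
    {
        var names = new List<string>();
        foreach (Match m in Regex.Matches(text, @"([\p{Lu}\p{Lt}\p{Lo}]\w*\s+)+(?=\d{5})"))
        {
            // The match ends with the whitespace that precedes the ID, so trim it off.
            names.Add(m.Value.TrimEnd());
        }
        return names;
    }

    public static void Main()
    {
        // Sample sentence made up for this example.
        string text = "we billed Hello Industries 12345 and later Another One 54321 as agreed";

        Console.WriteLine(string.Join(" | ", ExtractNames(text))); // prints: Hello Industries | Another One
    }
}
```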

Upvotes: 0
