Fuji - H2O

Reputation: 387

Regex to extract 2 texts from a variable

I need two separate regexes (.NET, C#) to extract values. I can't use grouping; I just need two separate regexes. I tried (?:^|(?:[.!?]\s))(\w+), but it gives me 6;. Supplier

For example

6. Supplier - Compressors, Drivers & Refrig. Units

The 1st regex should give me Supplier; the 2nd regex should give me Compressors, Drivers & Refrig. Units

Example values are listed below. They are not together like this in my scenario; each one is a separate instance. I need two separate regexes because the two values are evaluated at different positions in the code.

4. Cost/Estimating
6. Supplier - Minor Material: Specialty
6. Supplier - Pressure Vessels & Filters
6. Supplier - Pumps
6. Supplier - Minor Material: Valves
6. Supplier - Minor Material: Specialty
6. Supplier - Other Major Equipment
7. Manufacturing
8. Project Management
9. Order Release (OTR) - Commercial Documents
9. Order Release (OTR) - Hand-Off

Upvotes: 0

Views: 135

Answers (2)

Riv

Reputation: 1859

You can use one regex with grouping:

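    using System.Text.RegularExpressions;

    string myString = "6. Supplier - Minor Material: Specialty";
    // Named groups: the number before the dot, the text before " - ", the text after " - ".
    // .+? (lazy) is used instead of \w+ so multi-word values such as "Order Release (OTR)" also match.
    var regexPattern = @"(?<number>\d+)\.\s(?<supplier>.+?)\s-\s(?<product>.+)";
    var matched = Regex.Match(myString, regexPattern);
    if (matched.Success)
    {
        var supplier = matched.Groups["supplier"].Value; // "Supplier"
        var product = matched.Groups["product"].Value;   // "Minor Material: Specialty"
    }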

Based on your requirements for Nintex, you could try the following two regular expressions, though they're far from ideal:

Matches the text between the number and the dash, e.g. Supplier (with lookbehind and lookahead):

(?<=\d\.\s).+(?=\s-\s)

Matches everything after the dash (with lookbehind):

(?<=\s-\s).+
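
As a rough sketch of how the two lookaround patterns above could be applied independently in C#, the snippet below runs each one on its own against a single value. The class name SupplierRegexDemo, the constant names, and the TryExtract helper are made up for illustration; only Regex.Match and the two patterns come from the answer. Values without a " - " separator (such as "4. Cost/Estimating") match neither pattern, so that case is reported as a failed extraction.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SupplierRegexDemo
{
    // Text between "N. " and " - " (lookbehind + lookahead, no capture groups).
    public const string CategoryPattern = @"(?<=\d\.\s).+(?=\s-\s)";

    // Everything after " - " (lookbehind only, no capture groups).
    public const string DetailPattern = @"(?<=\s-\s).+";

    // Hypothetical helper: returns true and the whole match when the pattern matches,
    // otherwise returns false and an empty string.
    public static bool TryExtract(string input, string pattern, out string value)
    {
        Match m = Regex.Match(input, pattern);
        value = m.Success ? m.Value : string.Empty;
        return m.Success;
    }

    public static void Main()
    {
        string input = "6. Supplier - Compressors, Drivers & Refrig. Units";

        // Each regex is evaluated separately, as the question requires.
        if (TryExtract(input, CategoryPattern, out string category))
            Console.WriteLine(category); // Supplier

        if (TryExtract(input, DetailPattern, out string detail))
            Console.WriteLine(detail);   // Compressors, Drivers & Refrig. Units

        // A value with no " - " separator matches neither pattern.
        if (!TryExtract("4. Cost/Estimating", CategoryPattern, out _))
            Console.WriteLine("(no match)");
    }
}
```

Because .+ is greedy, a value containing more than one " - " would make the first pattern run up to the last separator; none of the sample values in the question has that shape.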

Upvotes: 2

ctwheels

Reputation: 22837

Code

^\d+\.[ \t]*([^-\r\n]+?)(?:$|[ \t]*-[ \t]*(.*))

Results

Input

4. Cost/Estimating
6. Supplier - Minor Material: Specialty
6. Supplier - Pressure Vessels & Filters
6. Supplier - Pumps
6. Supplier - Minor Material: Valves
6. Supplier - Minor Material: Specialty
6. Supplier - Other Major Equipment
7. Manufacturing
8. Project Management
9. Order Release (OTR) - Commercial Documents
9. Order Release (OTR) - Hand-Off
6. Supplier - Compressors, Drivers & Refrig. Units

Output

The output below shows capture group 1 followed, on the next line, by the optional capture group 2. Complete matches are separated by a blank line.

Cost/Estimating

Supplier
Minor Material: Specialty

Supplier
Pressure Vessels & Filters

Supplier
Pumps

Supplier
Minor Material: Valves

Supplier
Minor Material: Specialty

Supplier
Other Major Equipment

Manufacturing

Project Management

Order Release (OTR)
Commercial Documents

Order Release (OTR)
Hand-Off

Supplier
Compressors, Drivers & Refrig. Units

Explanation

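The g flag (global: don't stop at the first match) and the m flag (multi-line: ^ matches at the start of each line and $ at the end of each line) are used. .NET has no g flag; Regex.Matches returns every match, and RegexOptions.Multiline corresponds to m.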

  • ^ Assert position at the start of the line
  • \d+ Match one or more digits
  • \. Match the dot character . literally
  • [ \t]* Match any number of spaces or tabs
  • ([^-\r\n]+?) Capture one or more characters not present in the set -\r\n, but as few as possible, into capture group 1
  • (?:$|[ \t]*-[ \t]*(.*)) Match either of the following two alternatives
    • $ Assert position at the end of the line
    • [ \t]*-[ \t]*(.*) Match the following in sequence
      • [ \t]* Match any number of spaces or tabs
      • - Match the hyphen character - literally
      • [ \t]* Match any number of spaces or tabs
      • (.*) Capture any character (except for line terminators) any number of times into capture group 2
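
The breakdown above can be tried out in C# with something like the following sketch. The class name LineSplitDemo and the Split helper are hypothetical; Regex.Matches, RegexOptions.Multiline, and the pattern are the only parts taken from the answer. The sample text is assumed to use "\n" line endings: in .NET, $ under RegexOptions.Multiline matches only before "\n", so input with "\r\n" endings would likely need normalizing first.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LineSplitDemo
{
    public const string Pattern = @"^\d+\.[ \t]*([^-\r\n]+?)(?:$|[ \t]*-[ \t]*(.*))";

    // Hypothetical helper: one (First, Second) pair per matching line.
    // Second is an empty string when the line has no hyphen-separated part.
    public static List<(string First, string Second)> Split(string text)
    {
        var result = new List<(string First, string Second)>();

        // Regex.Matches returns all matches (the "g" behaviour);
        // RegexOptions.Multiline makes ^ and $ work per line (the "m" flag).
        foreach (Match m in Regex.Matches(text, Pattern, RegexOptions.Multiline))
        {
            // An unmatched optional group has Value == "".
            result.Add((m.Groups[1].Value, m.Groups[2].Value));
        }
        return result;
    }

    public static void Main()
    {
        // Assumption: lines are separated by "\n" only.
        string text = "4. Cost/Estimating\n6. Supplier - Pumps\n9. Order Release (OTR) - Hand-Off";

        foreach (var (first, second) in Split(text))
        {
            Console.WriteLine($"{first} | {second}");
        }
        // Cost/Estimating | 
        // Supplier | Pumps
        // Order Release (OTR) | Hand-Off
    }
}
```

Note that this approach relies on capture groups, which the question says are unavailable in its setting, so it mainly serves as a way to check the pattern against the sample input.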

Upvotes: 0
