Sergio Romero

Reputation: 6597

C#: How to combine multiple regex patterns into a single one, regardless of order

This is a sample of the text that needs to be parsed:

============================================================================================================================================================
line table (detailed)
============================================================================================================================================================

------------------------------------------------------------------------------------------------------------------------------------------------------------
line
------------------------------------------------------------------------------------------------------------------------------------------------------------
              if-index : 1/1/4/1                            rel-cap-occ-up : 60                                noise-margin-up : 165
     output-power-down : 92                             sig-attenuation-up : 32                            loop-attenuation-up : 30
         actual-opmode : g993-2-8d                            xtu-c-opmode : 00:00:00:00:00:00:00:00:48:00:00:00:00:00:00:00
            ansi-t1413 : dis-ansi-t1413                           etsi-dts : dis-etsi-dts                             g992-1-a : dis-g992-1-a
              g992-1-b : dis-g992-1-b                             g992-2-a : dis-g992-2-a                             g992-3-a : dis-g992-3-a
              g992-3-b : dis-g992-3-b                            g992-3-aj : dis-g992-3-aj                           g992-3-l1 : dis-g992-3-l1
             g992-3-l2 : dis-g992-3-l2                           g992-3-am : dis-g992-3-am                            g992-5-a : dis-g992-5-a
              g992-5-b : dis-g992-5-b                          ansi-t1.424 : dis-ansi-t1.424                           etsi-ts : dis-etsi-ts
            itu-g993-1 : dis-itu-g993-1                       ieee-802.3ah : dis-ieee-802.3ah                        g992-5-aj : dis-g992-5-aj
             g992-5-am : dis-g992-5-am                           g993-2-8a : dis-g993-2-8a                           g993-2-8b : dis-g993-2-8b
             g993-2-8c : dis-g993-2-8c                           g993-2-8d : g993-2-8d                              g993-2-12a : dis-g993-2-12a
            g993-2-12b : dis-g993-2-12b                         g993-2-17a : g993-2-17a                             g993-2-30a : dis-g993-2-30a
       actual-psd-down : -586                             power-mgnt-state : l0
     per-bnd-lp-att-up : 00:08:00:21:04:f6:04:f6:04:f6
     pr-bnd-sgn-att-up : 04:f6:00:21:04:f6:04:f6:04:f6
     pr-bnd-nois-mg-up : 02:76:00:a5:02:76:02:76:02:76                                                            high-freq-up : 5197
          elect-length : 3                                   time-adv-corr : -902                           actual-tps-tc-mode : ptm
     actual-ra-mode-up : automatic                           vect-cpe-type : legacy
============================================================================================================================================================

As an example, I have the following patterns to get the values of three variables:

const string pattern1 = @"itu-g993-1[\s]{0,1}:[\s]{0,1}(?<itu_g993_1>.*?(?=\s))";
const string pattern2 = @"time-adv-corr[\s]{0,1}:[\s]{0,1}(?<time_adv_corr>.*?(?=\s))";
const string pattern3 = @"xtu-c-opmode[\s]{0,1}:[\s]{0,1}(?<xtu_c_opmode>.*?(?=\s))";

By themselves they work fine.

My questions would be:

  1. How can I combine these three patterns into a single call to Regex.Match so that I get all three results?
  2. Is there a performance advantage or disadvantage to doing this in one call rather than in multiple calls to the Regex.Match method?

The reason why I am asking this question is that our requirements are still fuzzy and we do not know exactly which and how many of these variables we will need to extract.

Upvotes: 0

Views: 3844

Answers (2)

Wiktor Stribiżew

Reputation: 626845

If these values are always all present, you can use capturing groups inside positive look-aheads and the following regular expression:

(?s)^(?=.*itu-g993-1\s?:\s?(?<itu>\S*))(?=.*time-adv-corr\s?:\s?(?<time>\S*))(?=.*xtu-c-opmode\s?:\s?(?<xtu>\S*))

You may test it at regexstorm.net.

Even though look-arounds do not consume text, the text they inspect can still be captured into groups, which is useful when we do not need the overall match but only a piece of the text.

Note that every positive look-ahead must succeed for the pattern to match, so if xtu-c-opmode is missing while itu-g993-1 and time-adv-corr are present, there will be no match and no captured groups either.
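
For illustration, here is a minimal sketch of how the pattern above might be used from C#. It assumes C# 9 top-level statements and a shortened stand-in for the device output; the `Lookahead` helper and the `tolerant` variant are hypothetical additions, not part of the answer. The tolerant variant adds an empty alternative inside each look-ahead so that a missing field leaves its group unset instead of failing the whole match.

```csharp
using System;
using System.Text.RegularExpressions;

// Shortened stand-in for the device output shown in the question.
string text =
    "         actual-opmode : g993-2-8d      xtu-c-opmode : 00:00:00:00:00:00:00:00:48:00:00:00:00:00:00:00\n" +
    "            itu-g993-1 : dis-itu-g993-1      ieee-802.3ah : dis-ieee-802.3ah\n" +
    "          elect-length : 3      time-adv-corr : -902      actual-tps-tc-mode : ptm\n";

// Hypothetical helper: builds one look-ahead that captures the value of a label.
// With optional = true, an empty alternative lets the look-ahead succeed without a capture.
string Lookahead(string label, string group, bool optional)
{
    string fallback = optional ? "|" : "";
    return @"(?=.*" + Regex.Escape(label) + @"\s?:\s?(?<" + group + @">\S*)" + fallback + ")";
}

// Same pattern as in the answer; RegexOptions.Singleline is the equivalent of (?s).
var strict = new Regex(
    "^" + Lookahead("itu-g993-1", "itu", false)
        + Lookahead("time-adv-corr", "time", false)
        + Lookahead("xtu-c-opmode", "xtu", false),
    RegexOptions.Singleline);

Match m = strict.Match(text);
if (m.Success)
{
    Console.WriteLine("itu  = " + m.Groups["itu"].Value);
    Console.WriteLine("time = " + m.Groups["time"].Value);
    Console.WriteLine("xtu  = " + m.Groups["xtu"].Value);
}

// Simulate input in which one field is missing.
string withoutXtu = text.Replace("xtu-c-opmode", "xtu-x-opmode");
Console.WriteLine("strict match without xtu: " + strict.Match(withoutXtu).Success);

// Assumed variant: every field is optional.
var tolerant = new Regex(
    "^" + Lookahead("itu-g993-1", "itu", true)
        + Lookahead("time-adv-corr", "time", true)
        + Lookahead("xtu-c-opmode", "xtu", true),
    RegexOptions.Singleline);

Match t = tolerant.Match(withoutXtu);
Console.WriteLine("tolerant: itu = " + t.Groups["itu"].Value
    + ", xtu captured = " + t.Groups["xtu"].Success);
```

With the tolerant variant, check `Groups["..."].Success` for each field before reading its value, since the overall match succeeds even when nothing was captured.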

Upvotes: 1

user557597

Reputation:

Because the engine simply finds matches as it scans left to right,
you can join the patterns together with alternation.
Each match then has exactly one successful named group, so you loop over the matches.

Edit: If the set of regexes you need varies from case to case,
you could write a function that builds the full regex by
joining the individual ones (with alternation) according to a passed-in bitmask.
That gives you a central place to store and manage all the individual regexes.

string Lines =
@"
              if-index : 1/1/4/1                            rel-cap-occ-up : 60                                noise-margin-up : 165
     output-power-down : 92                             sig-attenuation-up : 32                            loop-attenuation-up : 30
         actual-opmode : g993-2-8d                            xtu-c-opmode : 00:00:00:00:00:00:00:00:48:00:00:00:00:00:00:00
            ansi-t1413 : dis-ansi-t1413                           etsi-dts : dis-etsi-dts                             g992-1-a : dis-g992-1-a
              g992-1-b : dis-g992-1-b                             g992-2-a : dis-g992-2-a                             g992-3-a : dis-g992-3-a
              g992-3-b : dis-g992-3-b                            g992-3-aj : dis-g992-3-aj                           g992-3-l1 : dis-g992-3-l1
             g992-3-l2 : dis-g992-3-l2                           g992-3-am : dis-g992-3-am                            g992-5-a : dis-g992-5-a
              g992-5-b : dis-g992-5-b                          ansi-t1.424 : dis-ansi-t1.424                           etsi-ts : dis-etsi-ts
            itu-g993-1 : dis-itu-g993-1                       ieee-802.3ah : dis-ieee-802.3ah                        g992-5-aj : dis-g992-5-aj
             g992-5-am : dis-g992-5-am                           g993-2-8a : dis-g993-2-8a                           g993-2-8b : dis-g993-2-8b
             g993-2-8c : dis-g993-2-8c                           g993-2-8d : g993-2-8d                              g993-2-12a : dis-g993-2-12a
            g993-2-12b : dis-g993-2-12b                         g993-2-17a : g993-2-17a                             g993-2-30a : dis-g993-2-30a
       actual-psd-down : -586                             power-mgnt-state : l0
     per-bnd-lp-att-up : 00:08:00:21:04:f6:04:f6:04:f6
     pr-bnd-sgn-att-up : 04:f6:00:21:04:f6:04:f6:04:f6
     pr-bnd-nois-mg-up : 02:76:00:a5:02:76:02:76:02:76                                                            high-freq-up : 5197
          elect-length : 3                                   time-adv-corr : -902                           actual-tps-tc-mode : ptm
     actual-ra-mode-up : automatic                           vect-cpe-type : legacy
";
Regex RxData = new Regex(
              @"
                  itu-g993-1[\s]{0,1}:[\s]{0,1}(?<itu_g993_1>.*?(?=\s))
                | time-adv-corr[\s]{0,1}:[\s]{0,1}(?<time_adv_corr>.*?(?=\s))
                | xtu-c-opmode[\s]{0,1}:[\s]{0,1}(?<xtu_c_opmode>.*?(?=\s))
              ", RegexOptions.IgnorePatternWhitespace );

Match _mData = RxData.Match( Lines );
while (_mData.Success)
{
    if (_mData.Groups["itu_g993_1"].Success )
        Console.WriteLine("itu_g993_1 =  {0} \r\n", _mData.Groups["itu_g993_1"].Value);
    if (_mData.Groups["time_adv_corr"].Success)
        Console.WriteLine("time_adv_corr =  {0} \r\n", _mData.Groups["time_adv_corr"].Value);
    if (_mData.Groups["xtu_c_opmode"].Success)
        Console.WriteLine("xtu_c_opmode =  {0} \r\n", _mData.Groups["xtu_c_opmode"].Value);

    _mData = _mData.NextMatch();
}

Output

xtu_c_opmode =  00:00:00:00:00:00:00:00:48:00:00:00:00:00:00:00

itu_g993_1 =  dis-itu-g993-1

time_adv_corr =  -902
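
The idea of generating the combined regex from a central place could be sketched roughly as follows. This is only an illustration under some assumptions: it takes a list of field names rather than the bitmask mentioned above, `BuildFieldRegex` is a hypothetical helper name, it uses the shorter `\s?` and `\S*` forms instead of `[\s]{0,1}` and `.*?(?=\s)`, and it assumes C# 9 top-level statements with a shortened input.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: builds one alternation covering the requested field names.
Regex BuildFieldRegex(IEnumerable<string> fieldNames)
{
    var alternatives = fieldNames.Select(name =>
    {
        // Group names may only contain word characters, so map '-' and '.' to '_'.
        string group = Regex.Replace(name, @"\W", "_");
        return Regex.Escape(name) + @"\s?:\s?(?<" + group + @">\S*)";
    });
    return new Regex(string.Join("|", alternatives));
}

// Shortened stand-in for the device output.
string text =
    "         actual-opmode : g993-2-8d      xtu-c-opmode : 00:00:00:00:00:00:00:00:48:00:00:00:00:00:00:00\n" +
    "            itu-g993-1 : dis-itu-g993-1      ieee-802.3ah : dis-ieee-802.3ah\n" +
    "          elect-length : 3      time-adv-corr : -902      actual-tps-tc-mode : ptm\n";

Regex rx = BuildFieldRegex(new[] { "itu-g993-1", "time-adv-corr", "xtu-c-opmode" });

// Each match fills exactly one named group; collect them into a dictionary.
var values = new Dictionary<string, string>();
foreach (Match match in rx.Matches(text))
{
    foreach (string groupName in rx.GetGroupNames())
    {
        // Group "0" is the whole match, not a field.
        if (groupName != "0" && match.Groups[groupName].Success)
            values[groupName] = match.Groups[groupName].Value;
    }
}

foreach (var pair in values)
    Console.WriteLine(pair.Key + " = " + pair.Value);
```

Fields that do not occur in the input are simply absent from the dictionary, which may suit the case where the list of required variables is still changing.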

Upvotes: 1
