Muckeypuck

Reputation: 559

Regular Expression not capturing expected strings

The pattern:

000[0-9]Text(.*)?000[0-9]Font

The input:

ISystem, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
-System.Collections.Specialized.ListDictionary head version count comparer
System.Collections.IComparer
0001 Text QSystem.Drawing, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
0001Center System.Drawing.PointF x y \ ˆ=mÀÉ@ 0001Expansion System.Drawing.SizeF width height Ì6 AûY? 0001Color
System.Drawing.Color name value knownColor state   ÿ
0001Text YPARTIAL RELEASE IN 09/2573, 09/2835, 09/2908, 11/86, 11/1741, 11/1803, 11/2484, 11/2765 0001Font
System.Drawing.Font Name Size Style Unit
System.Drawing.FontStyle System.Drawing.GraphicsUnit
Microsoft Sans Serif @A èÿÿÿ System.Drawing.FontStyle value__
çÿÿÿ System.Drawing.GraphicsUnit value__
0001Charset 0001TextOrientation
0002 ! 0002Center " # " )\ >{ Î@ #
$ 0002Expansion % & % )\@Ãõ(? & ' 0002Color ( ) (   ÿ ) * 0002Text + 12/452, 12/1629 , , - 0002Font . / . 0 Microsoft Sans Serif @A Ïÿÿÿèÿÿÿ Îÿÿÿçÿÿÿ / 3 0002Charset
4 4 5 0002TextOrientation 6 6 7 0003
9 9 : 0003Center ; < ; ö(\¿ …ËÀ < =
0003Expansion > ? > …ë @ ? ? @ 0003Color A
B A % B C 0003Text D SATISFACTION IN 13/25 E E F 0003Font G H G I
Microsoft Sans Serif @A ¶ÿÿÿèÿÿÿ µÿÿÿçÿÿÿ H L
0003Charset M M N 0003TextOrientation

What I am expecting:

In the .NET regex engine it returns null.

What I get in Expresso:

 YPARTIAL RELEASE IN  09/2573, 09/2835, 09/2908, 11/86,  11/1741, 11/1803, 11/2484, 11/2765                    0001Font                System.Drawing.Font     Name Size Style Unit      System.Drawing.FontStyle     System.Drawing.GraphicsUnit              Microsoft Sans Serif  @A èÿÿÿ System.Drawing.FontStyle     value__           çÿÿÿ System.Drawing.GraphicsUnit     value__                         0001Charset                          0001TextOrientation                          0002                    !    0002Center "    #    "       )\ >{ Î@ #        $    0002Expansion %    &    %       )\@Ãõ(? &        '    0002Color (    )    (           ÿ         )        *    0002Text +    12/452,  12/1629 ,    ,        -    0002Font .    /    .        0    Microsoft Sans Serif  @A Ïÿÿÿèÿÿÿ     Îÿÿÿçÿÿÿ     /        3    0002Charset       4    4        5    0002TextOrientation       6    6        7    0003      9    9        :    0003Center ;    <    ;       ö(\¿ …ËÀ <        =    0003Expansion >    ?    >       …ë @   ? ?        @    0003Color A    B    A                %    B        C    0003Text D    SATISFACTION  IN  13/25 E    E        F    

C# code

Regex reg = new Regex(@"000\dText(.*)?000\dFont", RegexOptions.Multiline);
foreach (Match match in reg.Matches(test))
{
    if (match.Groups[1].Captures[0].Value.IndexOf("System") < 0)
        details += match.Groups[1].Captures[0].Value.Trim() + "\r\n";
}

Upvotes: 0

Views: 118

Answers (1)

Wiktor Stribiżew

Reputation: 626826

Use

@"000[0-9]Text\b(.*?)000[0-9]Font\b"

See the regex demo

Two points:

  • (.*)? is a greedy .* inside an optional group (the trailing ? means one or zero times), so it runs up to the last 000 + digit + Font it can reach. Change it to the lazy dot pattern, .*?, which stops at the first one.
  • Text should be followed by a word boundary (\b); otherwise you may match the 0001TextOrientation block.
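
To make the two points concrete, here is a minimal sketch of the question's loop using the corrected pattern. The class name TextBlockExtractor, the Extract helper and the shortened sample string are hypothetical, made up for illustration. RegexOptions.Singleline is also an assumption that is not part of the answer: in .NET, Multiline only changes the meaning of ^ and $, while Singleline lets . match line breaks, which may matter if the real input contains newlines between Text and Font.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TextBlockExtractor
{
    // Lazy capture (.*?) plus word boundaries, as suggested in the answer.
    // Assumption: Singleline is added so that '.' can also cross line breaks.
    private static readonly Regex Pattern = new Regex(
        @"000[0-9]Text\b(.*?)000[0-9]Font\b",
        RegexOptions.Singleline);

    public static List<string> Extract(string input)
    {
        var results = new List<string>();
        foreach (Match match in Pattern.Matches(input))
        {
            string value = match.Groups[1].Value.Trim();
            // Same filter as the question's loop: skip captures containing type names.
            if (value.IndexOf("System", StringComparison.Ordinal) < 0)
                results.Add(value);
        }
        return results;
    }

    public static void Main()
    {
        // Shortened, hypothetical sample modelled on the input in the question.
        string test =
            "0001Text PARTIAL RELEASE IN 09/2573 0001Font Microsoft Sans Serif\n" +
            "0001Charset 0001TextOrientation\n" +
            "0002Text 12/452, 12/1629 0002Font Microsoft Sans Serif";

        foreach (string line in Extract(test))
            Console.WriteLine(line);
    }
}
```

On this sample the loop prints "PARTIAL RELEASE IN 09/2573" and "12/452, 12/1629". Without the \b after Text, the second match would instead start at 0001TextOrientation and swallow the 0002Text marker along with the text after it.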

Upvotes: 1
