Rob Segal

Reputation: 7625

Regular expression matching more than I need it to

I'm having trouble writing a regular expression that picks out all the calls to the "tr" function in the block of ASP code below. Specifically, I need the string argument of each "tr" call.

    if(RS.Fields("Audid").Value <> 0 ) Then
        Response.Write ("<td>" & tr("RA Assigned") & "</td>")
    else
        Response.Write ("<td>" & tr("Not Yet Assigned") & "</td>")
    End if

    if(RS.Fields("rStatus").Value = "Activated") then
        Response.Write("<td><A HRef='portal_setup_billingII.asp?OrderPId=" & RS.Fields("CustomerParid").Value & "&OrderId=" & RS.Fields("OrderId").Value & "'>" & tr("Edit") &"</A></td></TR>")
    Else
        If (gParLevelz_Admin = gParLevelz and RS.Fields("CustomerParid").Value <> 0) Then 
            Response.Write("<td><A HRef='portal_setup_billingII.asp?OrderPId=" & RS.Fields("CustomerParid").Value & "&OrderId=" & RS.Fields("OrderId").Value & "'>" & tr("Awaiting Authorization") & "</A></td></TR>")
        else                                       
            Response.Write("<td>" & tr("Awaiting Authorization") & "</td></TR>")
        End if
    End if

I believe I have a good first attempt at getting this done. The following expression extracts values for most of the cases I will run into...

tr\(\"([^%]|%[0-9]+)+\"\)

What confuses me most is how to capture every kind of string that can show up in a "tr" call. Literally anything could be between the quotation marks, and unfortunately my expression returns text well past the closing quotation mark. Given a snippet like the one above, one of the matches is...

tr("RA Assigned %2") & "</td>")
            else
                Response.Write ("<td>" & tr("Not Yet Assigned %4") & "</td>")
            End if

            if(RS.Fields("rStatus").Value = "Activated") then
                Response.Write("<td><A HRef='portal_setup_billingII.asp?OrderPId=" & RS.Fields("CustomerParid").Value & "&OrderId=" & RS.Fields("OrderId").Value & "'>" & tr("Edit") &"</A></td></TR>")
            Else
                If (gParLevelz_Admin = gParLevelz and RS.Fields("CustomerParid").Value <> 0) Then 
                    Response.Write("<td><A HRef='portal_setup_billingII.asp?OrderPId=" & RS.Fields("CustomerParid").Value & "&OrderId=" & RS.Fields("OrderId").Value & "'>" & tr("Awaiting Authorization") & "</A></td></TR>")
                else                                       
                    Response.Write("<td>" & tr("Awaiting Authorization") & "</td></TR>")

Which is way more than I want. I just want tr("RA Assigned %2") to be returned.

Upvotes: 2

Views: 193

Answers (8)

Ariel

Reputation: 5830

This should do it. Make the quantifier non-greedy by putting a ? after * or +:

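    using System;
    using System.Text.RegularExpressions;

    // .*? is non-greedy, so it stops at the first "\")" instead of the last one.
    // The parentheses around .*? capture just the string inside the quotes.
    const string pattern = "tr\\(\"(.*?)\"\\)";
    const string text = "tr(\"RA Assigned %2\") & \"</td>\")";
    Regex r = new Regex(pattern, RegexOptions.Compiled);
    Match m = r.Match(text);
    while (m.Success)
    {
        // m.Value is the whole call; Groups[1] is the quoted string.
        Console.WriteLine("{0} -> {1}", m.Value, m.Groups[1].Value);
        m = m.NextMatch();
    }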

(There is a good C# regex cheat sheet here.)

Upvotes: 0

Christopher Bruns

Reputation: 9518

tr\(\"([^\"]*)\"\)

Upvotes: 1

Nathan Taylor

Reputation: 24606

I'm not sure if it's perfect, but it properly retrieved all of the entries in your sample. While testing the other expressions on this page I found that some erroneous entries were being returned. This one does not return any bad data:

tr\("([\W\w\s]+?)"\)

The result returned will contain both the entire function call, and also the strings within the function. I tested it with the following input:

Response.Write ("<td>" & tr("RA Assigned") & "</td>")
Response.Write ("<td>" & tr("Not Yet Assigned") & "</td>")
Response.Write("<td><A HRef='portal_setup_billingII.asp?OrderPId=" & RS.Fields("CustomerParid").Value & "&OrderId=" & RS.Fields("OrderId").Value & "'>" & tr("Edit") &"</A></td></TR>")
Response.Write("<td><A HRef='portal_setup_billingII.asp?OrderPId=" & RS.Fields("CustomerParid").Value & "&OrderId=" & RS.Fields("OrderId").Value & "'>" & tr("Awaiting Authorization") & "</A></td></TR>")                              
Response.Write("<td>" & tr("Awaiting Authorization") & "</td></TR>")
Response.Write ("<td>" & tr("RA Ass14151igned") & "</td>")
Response.Write ("<td>" & tr("RA %Ass_!igned") & "</td>")

And received the following output:

$matches Array:
(
    [0] => Array
        (
            [0] => tr("RA Assigned")
            [1] => tr("Not Yet Assigned")
            [2] => tr("Edit")
            [3] => tr("Awaiting Authorization")
            [4] => tr("Awaiting Authorization")
            [5] => tr("RA Ass14151igned")
            [6] => tr("RA %Ass_!igned")
        )

    [1] => Array
        (
            [0] => RA Assigned
            [1] => Not Yet Assigned
            [2] => Edit
            [3] => Awaiting Authorization
            [4] => Awaiting Authorization
            [5] => RA Ass14151igned
            [6] => RA %Ass_!igned
        )

)

On a related note, check out My Regex Tester. It's a super useful tool for testing regular expressions in your browser.

Upvotes: 0

Ahmad Mageed

Reputation: 96537

It looks like your regex pattern is greedy. Try making it non-greedy by adding a ? after the second +: tr\(\"([^%]|%[0-9]+)+?\"\)

A simplified version to capture anything inside the tr(...) would be: tr\(\"(.+?)\"\)
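
To see the difference side by side, here is a minimal sketch in Python (the thread itself mixes ASP, C# and PHP, so the language is only a convenience, and the two-line sample text is made up to resemble the question's code). It also shows one limitation worth knowing: the non-greedy version of the original pattern still rejects a % that is not followed by digits, while the simplified pattern does not care.

```python
import re

# Two tr() calls on separate lines, shaped like the ASP in the question.
text = (
    'Response.Write ("<td>" & tr("RA Assigned %2") & "</td>")\n'
    'Response.Write ("<td>" & tr("Not Yet Assigned %4") & "</td>")'
)

greedy = re.compile(r'tr\("([^%]|%[0-9]+)+"\)')   # pattern from the question
lazy = re.compile(r'tr\("([^%]|%[0-9]+)+?"\)')    # same, with +? instead of +
simple = re.compile(r'tr\("(.+?)"\)')             # simplified version

# group(0) is the whole match; the greedy pattern swallows both lines at once
# because [^%] also matches quotes, parentheses and newlines.
greedy_hits = [m.group(0) for m in greedy.finditer(text)]
lazy_hits = [m.group(0) for m in lazy.finditer(text)]
simple_hits = simple.findall(text)  # findall returns the captured group

print(len(greedy_hits))  # 1
print(lazy_hits)
print(simple_hits)

# A % that is not followed by digits defeats the original alternation,
# greedy or not, but the simplified pattern still matches.
percent_text = 'tr("RA %Ass_!igned")'
print(lazy.search(percent_text))     # None
print(simple.findall(percent_text))  # ['RA %Ass_!igned']
```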

Upvotes: 4

aDev

Reputation: 314

Just don't match on the quotation mark inside the string.

tr\(\"([^\"]+)\"\)
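
A rough Python sketch of how this behaves compared with a non-greedy dot. The sample strings are invented for illustration. The `doubled` pattern is an extra assumption that goes beyond this answer: VBScript writes a quote inside a string literal as two quotes, so if such strings occur in your pages the character class needs an alternative for `""`.

```python
import re

lazy_dot = re.compile(r'tr\("(.+?)"\)')
negated = re.compile(r'tr\("([^"]+)"\)')
# Assumption: embedded quotes are doubled ("") as in VBScript string literals.
doubled = re.compile(r'tr\("((?:[^"]|"")*)"\)')

normal = 'Response.Write("<td>" & tr("Awaiting Authorization") & "</td></TR>")'
odd = 'tr("Edit" ) & label("x")'   # hypothetical: space before the ")"
escaped = 'tr("Say ""hi""")'       # hypothetical: doubled quotes inside

# On ordinary input both patterns agree.
print(lazy_dot.findall(normal))  # ['Awaiting Authorization']
print(negated.findall(normal))   # ['Awaiting Authorization']

# When ") does not directly follow the string, the lazy dot keeps going until
# it finds one later on the line; the negated class refuses to match instead.
print(lazy_dot.findall(odd))
print(negated.findall(odd))      # []

# Doubled quotes stop the plain negated class; the extended class handles them.
print(negated.findall(escaped))  # []
print(doubled.findall(escaped))
```

Failing to match is usually the safer failure mode here, since a missing entry is easier to spot than a captured string that silently includes code.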

Upvotes: 0

zendrums

Reputation: 1

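tr\(\".*?\"\)

In regex, . matches any character (except a newline by default) and * means any number of repetitions (including 0). The ? after * makes it non-greedy so the match stops at the first closing quote, and the literal parentheses have to be escaped.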

Upvotes: 0

Rubens Farias

Reputation: 57976

You'll need a non-greedy pattern; just add a ?, like:

tr\(\"([^%]|%[0-9]+)+?\"\)
//                   ^--- notice this

Upvotes: 1

Anonymous

Reputation: 50379

Use a question mark after the + quantifier to make it non-greedy (it then matches only as much as it needs to).

Also, maybe anchor against ") & " if that always follows a call to tr().
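
One caveat if you try the anchor: in the question's sample the "Edit" line has no space between & and the next quote, so a literal `") & "` would skip it. A possible sketch in Python, using a lookahead that tolerates optional whitespace (the pattern names and the two sample lines are only for illustration):

```python
import re

lines = [
    'Response.Write ("<td>" & tr("RA Assigned") & "</td>")',
    'Response.Write("<td>" & tr("Edit") &"</A></td></TR>")',
]

# Literal anchor: requires exactly `") & "` after the string.
literal = re.compile(r'tr\("(.+?)"\) & "')
# Lookahead anchor: requires & and a quote after the call, with any spacing,
# and does not consume them as part of the match.
flexible = re.compile(r'tr\("(.+?)"\)(?=\s*&\s*")')

literal_hits = [literal.findall(line) for line in lines]
flexible_hits = [flexible.findall(line) for line in lines]

print(literal_hits)   # [['RA Assigned'], []]
print(flexible_hits)  # [['RA Assigned'], ['Edit']]
```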

Upvotes: 1
