Adam Mrozek

Reputation: 1490

Regex: match text not preceded by a quotation mark (ignoring whitespace)

I have the following text:

SELECT 
    U_ArrObjJson(
        s."Description", s."DateStart", sp.*
    ) as "Result" 
FROM "Supplier" s 
OUTER APPLY( 
    SELECT 
        U_ArrObjJson,
        'U_ArrObjJson(',
'                                             <- THE PROBLEM IS HERE
        U_ArrObjJson(
            p."Id", p."Description", p."Price"
        ) as "Products" 
    FROM "Products" p 
    WHERE p."SupplierId" = s."Id" 
) sp 

What I need to do is find instances of the U_ArrObjJson function that are not preceded by a quotation mark. I ended up with the following expression:

(?<!\')\bU_ArrObjJson\b[\n\r\s]*[\(]+

The problem is that the last occurrence of U_ArrObjJson is preceded by a single quotation mark, but there are spaces and newline characters between the quotation mark and the name I am looking for, so the lookbehind does not exclude it.

I need to use this expression with the .NET Regex class in my method:

var matches = new Regex(@"(?<!\')\bU_ArrObjJson\b[\n\r\s]*[\(]+", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant).Matches(template);

How can I modify my expression so that it ignores any whitespace between the quotation mark and the function name?

Upvotes: 0

Views: 70

Answers (1)

41686d6564

Reputation: 19661

Since .NET's regex engine supports variable-width lookbehinds, you can just add \s* to the lookbehind:

(?<!\'\s*)\bU_ArrObjJson\s*\(+


Notes:

  • [\n\r\s] can be replaced with just \s here because \s already matches any whitespace character, including line breaks. So, \n\r is redundant.

  • As indicated by Wiktor Stribiżew in the comments, the second \b is also redundant because the function name will be followed by either whitespace or a ( character. In both cases, a word boundary is already implied.

  • Unless you actually want to match the function name followed by multiple ( characters, you probably should also remove the + at the end.
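
Putting the notes above together, here is a minimal sketch of how the pattern might be used. The class name FunctionCallFinder, the method CountCalls, and the shortened template are made up for illustration and are not from the question. The trailing + is dropped as suggested in the last note, and the quote is written without a backslash because it needs no escaping:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class FunctionCallFinder
{
    // The lookbehind rejects a name that is preceded by a single quote
    // followed by any amount of whitespace (including line breaks).
    private static readonly Regex Pattern = new Regex(
        @"(?<!'\s*)\bU_ArrObjJson\s*\(",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    public static int CountCalls(string template)
    {
        return Pattern.Matches(template).Count;
    }

    public static void Main()
    {
        // Shortened stand-in for the SQL in the question (an assumption).
        string template =
            "SELECT U_ArrObjJson(s.\"Id\") FROM t\n" +   // real call: matched
            "OUTER APPLY (SELECT 'U_ArrObjJson(',\n" +   // quote directly before: skipped
            "'   \n" +                                   // quote, then spaces and a newline
            "    U_ArrObjJson(p.\"Id\")) sp";            // skipped because of that quote

        Console.WriteLine(CountCalls(template)); // prints 1
    }
}
```

Note that this treats any single quote followed only by whitespace as "in front of" the name, so a closing quote that happens to sit right before a real call would also suppress the match.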

Upvotes: 1
