SumOfDavid
SumOfDavid

Reputation: 281

How can I find code with a specific string, but not in comments

I'm trying to search through 1,000's of stored procedures and functions for a particular string (in this case an email address), but I want to ignore it when it's in a comment block.

This is the SQL syntax to find the objects, but there are hundreds of results and I don't want to go through each one to determine if the email address is being utilized in code or just in a comment block.

SELECT DISTINCT
  o.name objectname
FROM sys.sql_modules m
  INNER JOIN sys.objects o 
    ON m.object_id = o.object_id
WHERE definition LIKE '%[email protected]%'
ORDER BY o.name

I have a CLR library (SQLSharp) where I can use regex, but I'm not sure of the regex expression that would identify a comment vs valid code.

For those who are RegEx savvy and not SQL Server savvy, a comment is defined as a single line starting with two dashes (e.g. --) or multiple lines starting with /* and ending with */

Just the thought of trying to figure this out makes my brain hurt.

Upvotes: 6

Views: 2180

Answers (3)

Solomon Rutzky
Solomon Rutzky

Reputation: 48826

This can be done using the code I posted in an article on SQL Server Central:

Searching Objects for Text

The Stored Procedure described in that article has the ability to not only ignore both inline (--) and block (/* ... */) comments, but it can also ignore text within single quotes. The code defaults the option to ignore comments to True as it seems unlikely to want to search for non-impacting text. But the default for ignoring strings is False as dynamic SQL and some other text that can go into a string might still be valid, executable code in the end.

The code was developed for SQL Server 2000 and hence does not make use of NVARCHAR(MAX) in 1 or 2 places that would really help. I have since updated the code to use NVARCHAR(MAX) instead of NVARCHAR(4000) and hence a single temp table, but have not updated the article with those changes. I will try to do that one of these days. And I believe the code in the article looks at one database whereas the updated code will look through all accessible DBs if one is not specified.

Upvotes: 2

Lucero
Lucero

Reputation: 60190

You could use the parser component of my ModuleStore project, which creates a syntax tree from the SQL you pass in. This tree can then be queried and modified programatically.

http://code.google.com/p/bsn-modulestore/

Upvotes: 0

Pavan Manjunath
Pavan Manjunath

Reputation: 28525

You can use the following regex to identify both kinds of comments

(--.*?\n)|(/\*.*?\*/)

Make sure . matches newline too to identify multi line comments

Note that, on some systems, \r is treated as line break character and on some a combination of \r\n is the line break character.

Upvotes: 0

Related Questions