Ronnie Overby
Ronnie Overby

Reputation: 46460

Concerns about SQL Server 2008 Full Text Search

I have built a T-SQL query like this:

DECLARE @search nvarchar(1000) = 'FORMSOF(INFLECTIONAL,hills) AND FORMSOF(INFLECTIONAL,print) AND FORMSOF(INFLECTIONAL,emergency)'

SELECT * FROM Tickets
WHERE ID IN (
                -- unioned subqueries using CONTAINSTABLE
                            ...
            )

The GUI for this search will be an aspx page with a single textbox where the user can search.

I plan to somehow construct the search term to be like the example above (@search).

I have some concerns, though:

Upvotes: 11

Views: 2215

Answers (4)

Dave
Dave

Reputation: 1234

All of our searches are on columns in the database that have predefined valid characters. Our search algorithm incorporates this with a regex that only allows these predefined characters. Because of this escaping in the search string is not needed. Our regex weeds out any injection attempts in the web code (asp & aspx). For standard comments from the users, we use escaping that changes all characters that may be used for harm in SQL, ASP, ASPX, & Javascript.
The TransStar site http://latranstar.tann.com/ is using an extended form of Soundex to search for street names, addresses and cities anywhere in Southern California. The Soundex by itself eliminates any need for anti-injection code since it operates only on alpha characters.

Upvotes: 0

Anthony Gatlin
Anthony Gatlin

Reputation: 4413

Dan, I like your SanitizeInput method. I refactored it to make it more compact and enhance performance a little.

 static string SanitizeInput(string searchPhrase, int maxLength)
        {
            Regex r = new Regex(@";|'|--|xp_|/\*|\*/", RegexOptions.Compiled);
            return r.Replace(searchPhrase.Substring(0, searchPhrase.Length > maxLength ? maxLength : searchPhrase.Length), " ");
        }

        static string SanitizeInput(string searchPhrase)
        {
            const int MAX_SEARCH_PHRASE_LENGTH = 200;
            return SanitizeInput(searchPhrase, MAX_SEARCH_PHRASE_LENGTH);
        }

I agree that FreeTextTable is too lightweight of a solution.

Upvotes: 2

Rabid
Rabid

Reputation: 3034

In your example, you have the @search variable already defined. As a rule of thumb, you shouldn't include dynamically concatenated text into raw SQL, due to the risk of injection. However, you can of course set the value of @search in the calling command object from your application. This completely negates the risk of injection attacks.

I would recommend construction of the search term in C#; passing the final search term in as a parameter like already mentioned.

As far as I recall, FREETEXTTABLE uses word breakers to completely decompose the search terms into their individual components. However, the FREETEXTTABLE operator automatically decomposes words into inflectional equivalents also, so you won't have to construct a complex CONTAINSTABLE operator if you decide to use it.

You could INNER JOIN the results of multiple FREETEXTTABLE queries to produce an equivalent AND result.

Upvotes: 0

Dan Dumitru
Dan Dumitru

Reputation: 5423

I recently used Full-Text Search, so I'll try to answer some of your questions.

• "I hate building sql dynamically because of the risk of injection. How can I guard against this?"

I used a sanitize method like this:

static string SanitizeInput(string searchPhrase)
    {
        if (searchPhrase.Length > 200)
            searchPhrase = searchPhrase.Substring(0, 200);

        searchPhrase = searchPhrase.Replace(";", " ");
        searchPhrase = searchPhrase.Replace("'", " ");
        searchPhrase = searchPhrase.Replace("--", " ");
        searchPhrase = searchPhrase.Replace("/*", " ");
        searchPhrase = searchPhrase.Replace("*/", " ");
        searchPhrase = searchPhrase.Replace("xp_", " ");

        return searchPhrase;
    }

• Should I use FREETEXTTABLE instead? Is there a way to make FREETEXT look for ALL words instead of ANY?

I did use FREETEXTTABLE, but I needed any of the words. As much as I've read about it (and I've read quite a bit), you have to use CONTAINSTABLE to search for ALL words, or different combinations. FREETEXTTABLE seems to be the lighter solution, but not the one to pick when you want deeper customizations.

Upvotes: 2

Related Questions