ideaboxer
ideaboxer

Reputation: 4101

PSQLException: ERROR: syntax error in tsquery

Which characters must be avoided to make sure PSQLException: ERROR: syntax error in tsquery will not occur? The documentation does not say anything about how to escape the search string: http://www.postgresql.org/docs/8.3/static/datatype-textsearch.html

Upvotes: 24

Views: 22325

Answers (2)

websearch_to_tsquery

Another good one to consider that I saw on the docs is websearch_to_tsquery: https://www.postgresql.org/docs/17/textsearch-controls.html It is like plainto_tsquery in that it doesn't throw errors, but allows further search constructs to be exposed rather than just & AND as plainto_tsquery:

websearch_to_tsquery creates a tsquery value from querytext using an alternative syntax in which simple unformatted text is a valid query. Unlike plainto_tsquery and phraseto_tsquery, it also recognizes certain operators. Moreover, this function will never raise syntax errors, which makes it possible to use raw user-supplied input for search. The following syntax is supported:

  • unquoted text: text not inside quote marks will be converted to terms separated by & operators, as if processed by plainto_tsquery.
  • "quoted text": text inside quote marks will be converted to terms separated by <-> operators, as if processed by phraseto_tsquery.
  • OR: the word “or” will be converted to the | operator.
  • -: a dash will be converted to the ! operator.

Safe prefix search

Unfortunately, like to_tsquery, websearch_to_tsquery doesn't seem to support :* prefix search which is what I wanted for autocomplete suggestions. Alas. How to append prefix match to tsquery in PostgreSQL gives the idea of something along:

SELECT (plainto_tsquery('english', 'black sheep')::text || ':*')::tsquery;

which does prefix search on the last term. I wonder if it can blow up for any input other than inputs that lead to the empty string. We can also handle the empty string in PostgreSQL with:

select regexp_replace(plainto_tsquery('english', 'black sheep')::text || ':*', '^..$', '')::tsquery;

this way we also catch plainto_tsquery inputs that lead to the empty string like punctuation.

https://www.reddit.com/r/rails/comments/1cmloa6/sanitizing_a_search_phrase_when_using_to_tsvector/ points to this Ruby extension: https://github.com/Casecommons/pg_search/blob/master/lib/pg_search/features/tsearch.rb#L101 that one might be able to extract ideas from.

Upvotes: 0

Chris Farmiloe
Chris Farmiloe

Reputation: 14185

Use quotes around your terms if you want them as phrases/verbatim or they contain characters used in the syntax:

select to_tsquery('"hello there" | hi');

Bear in mind that you shouldn't really have crazy characters in your terms, since they are not going to match anything in the tsvector.

The (non-token) characters recognized by the tsquery parser are: \0 (null), (, ), (whitespace), |, &, :, * and !. But how you tokenize your query should be based on how you have setup your dictionary. There are a great many other characters that you will likely not want in your query, not because they will cause a syntax error but because it means you are not tokenizing your query correctly.

Use the plainto_tsquery version if it's a simple AND query and you don't want to deal with creating the query manually.

Upvotes: 40

Related Questions