Reputation: 5805
I don't really understand the difference between to_tsvector
and ::tsvector
in Postgres. I read the documentation on to_tsvector
here, but there doesn't seem to be any documentation for the other, ::tsvector
- that's a bit problematic. It is mentioned here, but it says something about normalizing before querying, and normalization is done through to_tsvector
.
I created this SQL Fiddle to demonstrate the two; here is the code if you don't want to navigate away:
DDL:
CREATE TABLE text (
text_id serial PRIMARY KEY,
source_text text NOT NULL,
destination_text text NOT NULL
);
and the SQL:
-- Throw some stuff in there
INSERT INTO text (source_text, destination_text) VALUES
('Hello', 'Hello Result'),
('With Comma, Query', 'WithComma, Result');
-- Forced to use punctuation in the query to match what is in the vector
SELECT T.source_text, T.destination_text
FROM text T
WHERE LOWER(T.source_text)::tsvector @@ LOWER('Hello')::tsquery;
-- Vector free of punctuation, don't include it in the query
SELECT T.source_text, T.destination_text
FROM text T
WHERE to_tsvector(LOWER(T.source_text)) @@ LOWER('Comma')::tsquery;
SELECT ts_debug('english', 'Something without a comma');
SELECT ts_debug('english', 'Something, with a comma');
It seems to me that to_tsvector
will take the text, strip it of punctuation and return the vector. On the other hand, ::tsvector
seems to include the punctuation in the vector, resulting in a need to use the same punctuation in the query.
What are the practical differences between the two? Is one generally preferred over the other? In what situations is each preferred?
Upvotes: 14
Views: 21695
Reputation: 28005
to_tsvector(text)
reads the string and does some normalization (taking language settings into account) on the string.
::tsvector
is a cast. It does not do any normalization (and does not care for language settings).
See: http://www.postgresql.org/docs/current/interactive/functions-textsearch.html
Some years ago a wrote an own to_tsvector() in python, since I was not happy with the way this was handled in postgres. This gave me more control.
To insert the data into the column I used the tsvector cast:
'UPDATE myapp_mymodel SET content=%s::tsvector where id=%s', [tsvector, self.id])
Upvotes: 22