Reputation:
I have a table posts:
CREATE TABLE posts (
    id serial primary key,
    content text
);
When a user submits a post, how can I compare it with the existing posts and find similar ones?
I'm looking for something like what StackOverflow does with "Similar Questions".
Upvotes: 2
Views: 492
Reputation: 656666
While Text Search is an option, it is not primarily meant for this type of search. Its typical use case is finding words in a document based on dictionaries and stemming, not comparing whole documents.
I am sure StackOverflow has put some smarts into its similarity search, as this is not a trivial matter.
You can get halfway decent results with the similarity function and operators provided by the pg_trgm module:
SELECT content, similarity(content, 'grand new title asking foo') AS sim_score
FROM posts
WHERE content % 'grand new title asking foo'
ORDER BY 2 DESC, content;
Be sure to have a GiST index on content for this.
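A minimal sketch of that setup, assuming the pg_trgm extension is available on the server. The index name posts_content_trgm_idx is made up for illustration; gist_trgm_ops is the operator class pg_trgm provides for GiST indexes:

```sql
-- Install the trigram module (needs sufficient privileges on the database).
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Trigram GiST index so the % operator can use an index scan.
-- The index name is hypothetical; pick whatever fits your naming scheme.
CREATE INDEX posts_content_trgm_idx
    ON posts USING gist (content gist_trgm_ops);
```

The % operator only matches rows whose similarity exceeds the current threshold, so it may be worth inspecting that value with show_limit() and tuning it for long text such as post bodies.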
But you'll probably have to do more. You could combine it with Text Search after identifying keywords in the new content.
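One possible reading of that combination, as a rough sketch only: filter candidates with Text Search on a few keywords, then rank the survivors by trigram similarity. The keywords 'title' and 'foo' are hypothetical picks from the example string, and the 'english' configuration is an assumption about the content's language:

```sql
-- Step 1 (WHERE): keep only posts that contain at least one chosen keyword.
-- Step 2 (ORDER BY): rank those posts by trigram similarity to the new post.
SELECT id,
       content,
       similarity(content, 'grand new title asking foo') AS sim_score
FROM   posts
WHERE  to_tsvector('english', content)
       @@ to_tsquery('english', 'title | foo')  -- hypothetical keywords
ORDER  BY sim_score DESC
LIMIT  10;
```

How to pick the keywords is left open here; that is the non-trivial part the answer alludes to.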
Upvotes: 5
Reputation: 48256
You need to use Full Text Search in Postgres.
http://www.postgresql.org/docs/9.1/static/textsearch-intro.html
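For reference, a basic Full Text Search query against the posts table might look like the sketch below. It assumes the 'english' configuration and reuses the example string from the other answer; note that plainto_tsquery combines the words with AND, so only posts containing all of them (after stemming and stop-word removal) will match:

```sql
-- Rank posts by how well they match the words of the new post.
SELECT id,
       content,
       ts_rank(to_tsvector('english', content), query) AS rank
FROM   posts,
       plainto_tsquery('english', 'grand new title asking foo') AS query
WHERE  to_tsvector('english', content) @@ query
ORDER  BY rank DESC
LIMIT  10;
```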
Upvotes: 0