Reputation: 1613
I have this text (taken from concatenated field row)
Astronomic Event 2013/1434H - Aceh ....
How do We search it by 2013
or 1434h
keywords?
I have tried below code but it resulting no row.
to_tsvector
result:
'2013/1434h':8,12 'aceh':1 'bin.....
Sample Case:
WITH sample_table as
(SELECT to_tsvector('Astronomic Event 2013/1434H - Aceh') sample_content)
SELECT *
FROM sample_table, to_tsquery('2013') q
WHERE sample_content @@ q
Upvotes: 1
Views: 721
Reputation: 78503
How do We search it by 2013 or 1434h keywords?
It seems like you want to replace:
to_tsquery('1434h') q
with:
to_tsquery('1434h | 2013') q
http://www.postgresql.org/docs/current/static/functions-textsearch.html
Side note: the to_tsquery()
syntax is extremely capricious. It doesn't allow for much if any fantasy, and many of the assumptions in Postgres are everything but end-user friendly.
More often than not, you'll be better off using plainto_tsquery()
, which allows any amount of garbage to be thrown at it. Thus, consider pre-processing the string before issuing the query. For instance, you could split the string, and OR
the original parts together:
where sc.text_index @@ (plainto_tsquery('1434h') || plainto_tsquery('2013'))
Doing so will make your code a bit more complex, but it won't rely on your users needing to understand that (contrary to what they're accustomed to in Google) they should enter 'quick & brown & fox & jumps & lazy & dog'
instead of plain 'The quick brown fox jumps over the lazy dog'
.
Edit: I ended up actually trying your sample query, and it seems you're actually running into a parser issue:
# SELECT alias, description, token FROM ts_debug('Astronomic Event 2013/1434H - Aceh');
alias | description | token
-----------+-------------------+------------
asciiword | Word, all ASCII | Astronomic
blank | Space symbols |
asciiword | Word, all ASCII | Event
blank | Space symbols |
file | File or path name | 2013/1434H
blank | Space symbols |
blank | Space symbols | -
asciiword | Word, all ASCII | Aceh
(8 rows)
http://www.postgresql.org/docs/current/static/textsearch-parsers.html
It looks like you might need to write (or find) and configure an app-specific parser. Having never done this personally, the best I can do is to highlight that Postgres allows this and includes a sample:
http://www.postgresql.org/docs/current/static/test-parser.html
Alternatively, change your tsvector-related trigger so that it matches e.g. \d{4}/\d+[a-zA-Z]
or whatever seems most appropriate, and adds spaces accordingly, before converting it to a tsvector. Something as simple as the following might do the trick if you never need to store file names:
SELECT alias, description, token
FROM ts_debug(replace('Astronomic Event 2013/1434H - Aceh', '/', ' / '));
Upvotes: 1