Reputation: 73
I am trying to identify observations from my data using a list of keywords.However, the search results contains observations where only part of the keyword matches. For instance the keyword ice
returns varices
. I am using the following code
select *
from mytab
WHERE myvar similar to'%((ice)|(cool))%';
I tried the _tsquery
and it does the exact match and does not include observations with varices
. But this approach is taking significantly longer to query. (2 keyword search for similar to '% %'
takes 5 secs, whereas _tsquery
takes 30 secs for 1 keyword search.I have more than 900 keywords to search)
select *
from mytab
where myvar @@ to_tsquery(('ice'));
Is there a way to query multiple keywords using the _tsquery
and any way to speed the querying process.
Upvotes: 0
Views: 836
Reputation: 995
I'd suggest using keywords in a relational sense rather than having a running list of them under one field, which makes for terrible performance. Instead, you can have a table of keywords with id's as primary keys and have foreign keys referring to mytab
's primary keys. So you'd end up with the following:
keywords table
id | mytab_id | keyword
----------------------
1 1 liver
2 1 disease
3 1 varices
4 2 ice
mytab table
id | rest of fields
---------------------
1 ....
2 ....
You can then do an inner join to find what keywords belong to the specified entries in mytab
:
SELECT * FROM mytab
JOIN keywords ON keywords.mytab_id = mytab.id
WHERE keyword = 'ice'
You could also add a constraint to make sure the keyword
and mytab_id
pair is unique, that way you don't accidentally end up with the same keyword
for the same entry in mytab
.
Upvotes: 1