Renato Dinhani
Renato Dinhani

Reputation: 36756

How to speed up this type of query?

I have a table with words from a text (the table is called token), each word is a row in the table. I want to retrieve adjacent words in the result.

Example: My name is Renato must return:

My | name
name | is
is | Renato

The following query works, but is slow. The textblockid determines the text that the word belongs, the sentence is the sentence count in the textblock (but at the moment the value is 1 for all) and the position attribute determines the order of the words.

select w1.text,w2.text 
from token as w1,
    (select textblockid,sentence,position,text from token
    order by textblockid,sentence,position) as w2
where w1.textblockid = w2.textblockid
and w1.sentence = w2.sentence
and w1.position = w2.position - 1

Is there a better/faster way to do this?

Upvotes: 0

Views: 81

Answers (2)

bfavaretto
bfavaretto

Reputation: 71939

Maybe an INNER JOIN with a second instance of token performs better. But it all depends on the data types of your columns, and the indexes you have in place.

For example, if sentence is a text column, the comparison between w1.sentence and w2.sentence will probably be very expensive. If it's a numeric id (a foreign key to a sentences table), and if you have an index on the column, it should be way faster. Assuming this last scenario, you could try this:

select w1.text,w2.text 
from token as w1
    INNER JOIN token as w2
    ON w2.sentence = w1.sentence
    AND w1.position = w2.position - 1
    AND w1.textblockid = w2.textblockid

Upvotes: 0

pkmiec
pkmiec

Reputation: 2694

I don't know postgresql in detail but for sure query could be simpler in sql server:

select w1.text,w2.text 
from token as w1, token as w2
where w1.textblockid = w2.textblockid
and w1.sentence = w2.sentence
and w1.position = w2.position - 1

(I think it is better to use simplest query and leave the rest for optimizer, which may be misleaded by your from subquery).

However if you have index on (textblockid, sentence, position) you really can't get anything more with sql.

Upvotes: 1

Related Questions