Reputation: 12716
I want to set the to_tsvector
language (e.g.: 'French') so it uses the proper dictionary when rendering the FTS vector.
Table messages has a locale_id
column, which is on the locales
table. Which I then need to join the locales
table to the languages
table on locale_id to get the actual language name.
This UPDATE is supposed to loop through all rows in messages
and set the vector
column = to_tsvector(joined language name, message)
, yet it updated each row to the same value and same language dictionary (e.g.: to_tsvector('french', stringX)
.
Why is this? Each row has a different message
string and a different locale_id
(meaning, different language name).
So if I simply disregard the language config for pg_dictionary and do:
UPDATE messages
SET vector = to_tsvector(message);
Table results:
Messages:
message | locale_id | vector
-----------------------------
Hi there | 1 | 'hi':1
Is a test | 2 | 'test':3
Le french | 3 | 'french':2 'le':1 --'le' SHOULD BE omitted since it's a stop word in French pg_dictionary
This works fine. Obviously, it's not loading in the correct language dict for each row. However, doing the following yields the same result for each row:
UPDATE messages
SET vector = to_tsvector(messages_languages.language::regconfig, messages_languages.message)
FROM (
select t3.language, t1.message
from messages as t1
inner join locales as t2 on (t1.locale_id = t2.id)
inner join languages as t3 on (t2.language_id = t3.id)
) messages_languages;
I've also tried it with a WITH
, same results:
WITH messages_languages as (
select t3.language, t1.message
from messages as t1
inner join locales as t2 on (t1.locale_id = t2.id)
inner join languages as t3 on (t2.language_id = t3.id)
)
UPDATE messages
SET vector = to_tsvector(messages_languages.language::regconfig, messages_languages.message)
FROM messages_languages;
Table results:
Messages:
message | locale_id | vector
-----------------------------
Hi there | 1 | 'french':2
Is a test | 2 | 'french':2
Le french | 3 | 'french':2 --'le' omitted correctly in french pg_dictionary as it's a STOP word
'french', for pg_dictionary_name = 'French', should be the only 'french':2 vector result in this table, yet all rows are the same
Locales:
id | language_id
------------------
1 | 4
2 | 5
3 | 6
Languages:
id | language
------------------
4 | 'English'
5 | 'German'
6 | 'French'
Upvotes: 0
Views: 1279
Reputation: 4523
UPDATE messages msg
SET vector = to_tsvector(lang.language::regconfig, msg.message)
FROM locales as loco
JOIN languages as lang ON loco.language_id = lang.id
WHERE msg.locale_id = loco.id
;
Upvotes: 1
Reputation: 12716
It turns out you must check the ID of the aliased subquery to the rows you're iterating through on the same table in UPDATE:
UPDATE messages
SET vector = to_tsvector(messages_languages.language::regconfig, messages_languages.message)
FROM (
select t1.id, t3.language, t1.message
from messages as t1
inner join locales as t2 on (t1.locale_id = t2.id)
inner join languages as t3 on (t2.language_id = t3.id)
) messages_languages
-- Need to make sure you're referencing the same row in the subquery by comparing IDs
WHERE messages.id = messages_languages.id;
Upvotes: 0