Reputation: 461
I have the following problem:
There is table LIST
with 5 columns:
There are around 100 000 rows
in this table. There is also table part
and history
and they all have part_number
column.
Now I need to update LIST.state column with value 1, 2 or 3 where LIST.refNo = some value
, depending on the result of the following situation:
If LIST.part_number does not exist in table part.part_number
set LIST.state to 1.
If it does, if this part_number and corresponding date in history is less than a week,
set LIST.State to 2. Else set it to 3.
I am using postgreSQL
. Any help would be greatly appreciated. So far, I know how to use this simple IF ELSE
statement:
DO
$BODY$
BEGIN
IF EXISTS (SELECT part_number FROM part WHERE refNo=7000) THEN
ELSE
UPDATE LIST set state = 1
END IF;
END;
$BODY$
I think its not the best approach because now I have to enclose this if with loop and since there are over 200 000 rows in history table
, comparing one by one will be super slow.
Maybe I can first select values that exist in part table and do queries on them to get values from history?
Upvotes: 1
Views: 344
Reputation: 657002
No plpgsql needed here.
UPDATE list l
SET state = u.state
FROM (
SELECT li.pk_col
, CASE WHEN h.hist_date > (now() - interval '1 week') THEN 2
WHEN p.part_number IS NULL THEN 1
ELSE 2
END AS state
FROM list li
LEFT JOIN part p USING (part_number)
LEFT JOIN history h USING (part_number)
WHERE li.refNo = <some value>
) u
WHERE l.pk_col = u.pk_col -- insert actual pk column
AND l.state IS DISTINCT FROM u.state;
Or, faster, but a bit more verbose without subquery:
UPDATE list l
SET state = CASE WHEN h.hist_date > (now() - interval '1 week') THEN 2
WHEN p.part_number IS NULL THEN 1
ELSE 2 END
FROM list li
LEFT JOIN part p USING (part_number)
LEFT JOIN history h USING (part_number)
WHERE li.refNo = <some value>
AND l.pk_col = li.pk_col -- insert actual pk column
AND l.state IS DISTINCT FROM
CASE WHEN h.hist_date > (now() - interval '1 week') THEN 2
WHEN p.part_number IS NULL THEN 1
ELSE 2 END
I am assuming the column history.hist_date
, since you did not specify.
Replace pk_col
with your actual primary key column(s). You did not specify.
You shouldn't be using unquoted mixed case identifiers like refNo
. Either use refno
or "refNo"
.
The last WHERE
condition avoids empty updates where state
already had the intended value. Omit if that can't happen in your case. Most of the time it's an improvement to performance people tend to forget.
For this to be fast you need indexes on the involved columns. PK columns are indexed automatically. You also need one for every part_number
, for list.refNo
and for history.hist_date
.
Ideally you'd have a multicolumn index on history
like:
CREATE INDEX h_multi_idx ON history (part_number, hist_date DESC);
Would trump one index for each column in this case.
Another way is a correlated subquery - simpler for updating a single or few row:
UPDATE list l
SET state = COALESCE((
SELECT CASE WHEN h.hist_date > (now() - interval '1 week') THEN 2 ELSE 3 END
FROM part p
LEFT JOIN history h USING (part_number)
WHERE p.part_number = l.part_number
), 1)
WHERE refNo = <some value>;
For lots of rows at once, correlated subqueries do not scale well.
Note that this updates the row even if state
already had the intended value .
For duplicates on (part_number)
in history
and (assuming) no duplicates on (part_number)
in part
or list
, here is one (of many possible) solutions:
UPDATE list l
SET state = u.state
FROM (
SELECT DISTINCT ON (h.part_number)
li.pk_col
, CASE WHEN h.hist_date > (now() - interval '1 week') THEN 2
WHEN p.part_number IS NULL THEN 1
ELSE 2
END AS state
FROM list li
LEFT JOIN part p USING (part_number)
LEFT JOIN history h USING (part_number)
WHERE li.refNo = <some value>
ORDER BY h.part_number, h.hist_date DESC NULLS LAST
) u
WHERE l.pk_col = u.pk_col -- insert actual pk column
AND l.state IS DISTINCT FROM u.state;
There may be more efficient ways. Depends on missing information.
Upvotes: 2
Reputation: 10959
You could do this with three separate queries (I'm sure there is a way to do it in one, like Erwins answer) which updates all values based on the ref no
at once.
Like this:
--First
UPDATE LIST
SET STATE = 1
WHERE NOT PART_NUM
IN (SELECT PART_NUM
FROM PART)
AND REF_no = 1000;
--Second
UPDATE LIST
SET STATE = 2
WHERE PART_NUM IN
(SELECT PART_NUM
FROM HISTORY
WHERE age(TIME, CURRENT_TIMESTAMP) > (INTERVAL '-7 days'))
AND REF_NO = 1000;
--Third
UPDATE LIST
SET STATE = 3
WHERE STATE NOT IN(1,2)
AND REF_NO = 1000
This is not very elegant but should get it done, cant be certain if it would be the fastest way either.
Here's the SQLFiddle
Upvotes: 1