Timurib
Timurib

Reputation: 2768

Duplicate key violation on UPDATE FROM SELECT query in Vertica

I want to UPDATE some column in dimension table client according to event table data:

UPDATE client AS c
SET 
    some_attr_last_value = last_attr_values.value 
FROM (

  SELECT
    l.client_id,
    value,
    ROW_NUMBER() OVER (PARTITION BY l.client_id ORDER BY timestamp DESC) AS num
  FROM event AS l
  WHERE 
    timestamp > DATE_TRUNC('month', NOW() - INTERVAL '1 month')

) AS last_attr_values

WHERE 
    last_attr_values.num = 1 AND 
    c.client_id = last_attr_values.client_id AND 
    c.some_attr_last_value <> last_attr_values.value;

I already have a few queries like above: they are update other columns in same manner (just with other columns) and works well. But one query produces the error:

ERROR 6745: Duplicate key values: 'client_id=…' -- violates constraint '… .client.C_PRIMARY'

But I don't try to change a client_id. Why this error occured?

I check the client table, but it have not duplicates. The ANALYZE_CONSTRAINTS('client') also returns no violations.

Version is 7.2.2-1.

Upvotes: 2

Views: 789

Answers (2)

Dave Poole
Dave Poole

Reputation: 1068

The problem isn't with your client table it is with your inner select.

SELECT
    l.client_id,
    value,
    ROW_NUMBER() OVER (PARTITION BY l.client_id ORDER BY timestamp DESC) AS num
  FROM event AS l
  WHERE 
    timestamp > DATE_TRUNC('month', NOW() - INTERVAL '1 month')

What is it in the SELECT that ensures that you have one record per client?

Let's suppose that you have events where you have multiple events for multiple clients.

  • Client_id 1 10:00
  • Client_id 1 10:05
  • Client_id 9 10:05
  • Client_id 9 10:06
  • Client_id 1 10:07

Because your window function is ordered by timestamp

ROW_NUMBER() OVER (PARTITION BY l.client_id ORDER BY timestamp DESC) AS num

You will get

  • Client_id 1 1
  • Client_id 1 2
  • Client_id 9 1
  • Client_id 9 2
  • Client_id 1 1 <---- 2nd Client_id 1 with a Row_number of 1!

I believe that you need to include Client_id in your ORDER BY for your query to work.


Update: After some time the error happened again. I tried to use the above solution, but unfortunately it did not help. The error was overcome only using a temporary table:

CREATE LOCAL TEMPORARY TABLE last_values ON COMMIT PRESERVE ROWS 
AS SELECT client_id, value FROM (
  SELECT
    l.client_id,
    value,
    ROW_NUMBER() OVER (PARTITION BY l.client_id ORDER BY timestamp DESC) AS num
  FROM event AS l WHERE timestamp > DATE_TRUNC('month', NOW() - INTERVAL '1 month')
) last_values WHERE last_values.num = 1;

UPDATE client AS c 
SET some_attr_last_value = last_values.value
FROM last_values 
WHERE c.client_id = last_values.client_id AND c.some_attr_last_value <> last_attr_values.value

DROP TABLE last_values;

So although the problem resolved, I did not find out its exact cause.

Upvotes: 3

woot
woot

Reputation: 7616

Vertica checks constraints at SELECT time normally, not when you UPDATE. Check your table for dups.

I think the latest release of Vertica does have some options now for checking at INSERT and UPDATE but that isn't the default.

Upvotes: 1

Related Questions