Reputation: 26057
I'm playing with BQ and I create a table and inserted some data. I reinserted it and it created duplicates. I'm sure I'm missing something, but is there something I can do to ignore it if the data exists in the table?
My use case is I get a stream of data from various clients and sometimes their data will include some data they previously already sent(I have no control on them submitting).
Is there a way to prevent duplicates when certain conditions are met? The easy one is if the entire data is the same but also if certain columns are present?
Upvotes: 0
Views: 843
Reputation: 135
Just like the brother up there said. Using MERGE like :
MERGE `your_dataset.temp_table` T
USING (
SELECT
FORMAT_DATE('%b-%y', date) AS month,
EXTRACT(YEAR FROM date) as year,
FORMAT_DATE('%b', date) as month_text,
EXTRACT(MONTH FROM date) as month_num,
1.5 as metric_value
FROM (
SELECT
DATE_TRUNC(DATE_SUB(CURRENT_DATE(), INTERVAL 1 MONTH), MONTH) AS date) AS main
) S
ON T.month = S.month
AND T.year = S.year
AND T.month_text = S.month_text
AND T.month_num = S.month_num
WHEN NOT MATCHED THEN
INSERT VALUES (S.month, S.year,S.month_text, S.month_num, S.metric_value)
Upvotes: 0
Reputation: 1978
It's difficult to answer your question without a clear idea of the table structure, but it feels like you could be interested in the MERGE
statement: ref here.
With this DML statement you can perform a mix of INSERT
, UPDATE
, and DELETE
statements, hence do exactly what you are describing.
Upvotes: 2