Reputation: 539
I have duplicated values in my data. However, from the duplicated values, i only want to store 1 values and remove the rest of same duplicated values. So far, I have found the solution where they remove ALL the duplicated values like this.
Code:
SELECT ID, a.date as date.A, b.date as date.B,
CASE WHEN a.date <> b.date THEN NULL END AS b.date
except(date.A)
FROM
table1 a LEFT JOIN table2 b
USING (ID)
WHERE date.A = 1
Sample input:
Sample output (Store only 1 values from the duplicated values and remove the rest):
NOTE: query might wrong as it remove all duplicated values.
Upvotes: 0
Views: 379
Reputation: 1269773
You can use aggregation. Something like this:
SELECT ANY_VALUE(a).*, ANY_VALUE(b).*
FROM table1 a LEFT JOIN
table2 b
USING (ID)
WHERE date.A = 1
GROUP BY id, a.date;
For each id/
datecombination, this returns an arbitrary matching row from
a/
b`.
Upvotes: 1
Reputation: 4042
Considering your screenshot's sample data and your explanation. I understand that you want to remove duplicates from your table retaining only one row of unique data. Thus, I was able to create a query to select only one row of data ignoring the duplicates.
In order to select the rows without any duplicates, you can use SELECT DISTINCT. According to the documentation, it discards any duplicate rows. In addition to this method, CREATE TABLE statement will also be used to create a new table (or replace the previous one) with the new data without duplicates. The syntax is as follows:
CREATE OR REPLACE TABLE project_id.dataset.table AS
SELECT DISTINCT ID, a.date as date.A, b.date as date.B,
CASE WHEN a.date <> b.date THEN NULL END AS b.date
except(date.A)
FROM
table1 a LEFT JOIN table2 b
USING (ID)
WHERE date.A = 1
And the output will be exactly the same as you shared in your question.
Notice that I used CREATE OR REPLACE, which means if you set project_id.dataset.table to the same path as the table within your select, it will replace your current table (in case you have the data coming from one unique table). Otherwise, it will create a new table with the specified new table's name.
Upvotes: 1