Reputation: 539
I have table which have column ID, which define unique records, all the rows having same ID are duplicate, I want to remove the duplicate rows and keeping only unique ID
ID Data
1 2
1 2
2 3
2 3
After Removal (My Data in BigQuery Table)
ID Data
1 2
2 3
I can create a new table having unique records only but is there any way I can update the existing table i.e remove the duplicate records from the table.
Upvotes: 0
Views: 205
Reputation: 172993
Consider below ( all the rows having same ID are duplicate)
select as value any_value(t)
from `project.dataset.table` t
group by id
with output
In case , if definition of duplicate is - all the rows having all fields duplicate - consider below
select as value any_value(t)
from `project.dataset.table` t
group by format('%t', t)
Below is example of how to use it with DDL:
create or replace table `project.dataset.table` as
select * from (
select as value any_value(t)
from `project.dataset.table` t
group by format('%t', t)
);
Upvotes: 1