Reputation: 1572
I have a table with fields like this
F1 F2 F3 F4
1 4 A 5
2 4 A 5
3 4 B 5
4 4 C 5
5 4 D 5
I would like to delete all rows where F3 is a duplicate, so in order to get no duplicate values in field F3:
F1 F2 F3 F4
1 4 A 5
3 4 B 5
4 4 C 5
5 4 D 5
Thx!
Upvotes: 0
Views: 58
Reputation: 1269563
You can use aggregation:
select any_value(f1), any_value(f2), f3, any_value(f4)
from t
group by f3;
If you want all the values to be from a particular row, you can use row_number()
:
select t.* except (seqnum)
from (select t.*,
row_number() over (partition by f3 order by f3) as seqnum
from t
) t;
You can adjust the order by
key to specify which row you want to keep (say the largest f1
value or whatever).
And another method is:
select as value any_value(t)
from t
group by f3;
Upvotes: 1
Reputation: 172974
Below is for BigQuery Standard SQL
#standardSQL
SELECT AS VALUE ANY_VALUE(t)
FROM `project.dataset.table` t
GROUP BY F3
You can test, play with above using sample data from your question as in below example
#standardSQL
WITH `project.dataset.table` AS (
SELECT 1 F1, 4 F2, 'A' F3, 5 F4 UNION ALL
SELECT 2, 4, 'A', 5 UNION ALL
SELECT 3, 4, 'B', 5 UNION ALL
SELECT 4, 4, 'C', 5 UNION ALL
SELECT 5, 4, 'D', 5
)
SELECT AS VALUE ANY_VALUE(t)
FROM `project.dataset.table` t
GROUP BY F3
with result
Row F1 F2 F3 F4
1 1 4 A 5
2 3 4 B 5
3 4 4 C 5
4 5 4 D 5
Upvotes: 1