Reputation: 335
I have a table "emp" as like follows in BIG QUERY
If we group by all columns or distinct records then we will get 3 as count, if we count(*) then it will give 4
I need to write a query which has to return "1", if distinct records and count(*) values are not equal (3 != 4 in our case) else "0" as flag like as below
Upvotes: 0
Views: 833
Reputation: 1804
This might scale a little nicer ...
Why nicer ?
#standard sql:
select
b.table_id Table_Name
,round(safe_divide(count(distinct(FARM_FINGERPRINT(FORMAT('%T', a)))) , row_count),4) dup_ratio
,cast( trunc(safe_divide(count(distinct(FARM_FINGERPRINT(FORMAT('%T', a)))) , row_count)) as INT64) your_flag
from
`bigquery-public-data.austin_bikeshare.*` a
inner join
bigquery-public-data.austin_bikeshare.__TABLES__ b
on a._TABLE_SUFFIX = b.table_id
group by
table_id ,row_count
Result :
Table_Name dup_ratio your_flag
bikeshare_stations 0.0104 0
bikeshare_trips 1.0 1
Upvotes: 0
Reputation: 173013
Below is for BigQuery Standard SQL
#standardSQL
SELECT IF(COUNT(*) = COUNT(DISTINCT FORMAT('%t', t)), 0, 1) AS flag
FROM `project.dataset.emp` t
if to apply to sample data from your question - result is
Row flag
1 1
Upvotes: 1
Reputation: 1269953
If you want to find duplicates, you can use:
select empno, empname, deptno
from t
group by empno, empname, deptno
having count(*) > 1;
You can then incorporate this into a query to return a flag if you like:
select exists (select 1
from t
group by empno, empname, deptno
having count(*) > 1
) as flag
Upvotes: 0