Dave
Dave

Reputation: 1950

Why does Google BigQuery return duplicate rows but doesn't count them?

If I do something like

select * from table where ID = 1234

I get 2 records.

if I run this query

select count(*) from table where ID = 1234

I get a count value of 1.

Has anyone experienced anything similar?

Right now, i don't know which answer is correct... do I have 1 record or 2 in the system?

Upvotes: 0

Views: 325

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172944

This scenario is possible if you have table with one row with id=1234, but this row has repeated field in it with two values.

BigQuery automatically flattens query results.
So below select just flattens out result and that's why you see two rows

SELECT * FROM table WHERE ID = 1234

whereas second statement below just counts rows, which is just one

SELECT COUNT(*) FROM table WHERE ID = 1234

See below real example that ressembles your case

SELECT * FROM [publicdata:samples.trigrams] 
WHERE ngram = 'to carry more'

above returns 241 rows, whereas below shows just 1 (one).

SELECT COUNT(*) FROM [publicdata:samples.trigrams]  
WHERE ngram = 'to carry more'

This is my guess of what happens with you. To make it certain - please provide schema of your table and maybe some data examples

You can read more about flattenning here

Upvotes: 1

Related Questions