bry888
bry888

Reputation: 297

Count records filtered by a field in a struct (nested and repeated column) in BigQuery

I have such data structure in BigQuery:

timestamp
fruit
properties (a nested and repeated field)
    properties.species
    properties.color

Sample data:

timestamp, fruit, properties struct: {properties.species, properties.color}
1        , apples,                    lobo           , green
                                      babo           , green
2        , pears ,                    mango          , green
                                      obo            , green
3        , pears ,                    bingo          , yellow

                                      

I want to count fruit rows, which properties.color is green. So the answer for the example is 2.

Is there a better approach than counting timestamps? (You not always have a timestamp column in the data)

SELECT count(distinct timestamp) FROM table, unnest (properties) properties
WHERE properties.color = green

Upvotes: 1

Views: 1251

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173106

see few options below

select sum((
  select count(distinct color)
  from t.properties where color = 'green'
)) as green_rows
from `project.dataset.table` t        

or

select sum((
    select sign(countif(color = 'green')) 
    from t.properties 
  )) as green_rows
from `project.dataset.table` t 

Upvotes: 2

Related Questions