Gabi
Gabi

Reputation: 453

Working with arrays with BigQuery LegacySQL

Each row in my table has a field that is an array, and I'd like to get a field from the first array entry.

For example, if my row is

[
  {
    "user_dim": {
      "user_id": "123",
      "user_properties": [
        {
          "key": "content_group",
          "value": {
            "value": {
              "string_value": "my_group"
            }
          }
        }
      ]
    },
    "event_dim": [
      {
        "name": "main_menu_item_selected",
        "timestamp_micros": "1517584420597000"
      },
      {
        "name": "screen_view",
        "timestamp_micros": "1517584420679001"
      }
    ]
  }
]

I'd like to get

user_id: 123, content_group: my_group, timestamp_1517584420597000

Upvotes: 2

Views: 629

Answers (1)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173190

As Elliott mentioned - BigQuery Standard SQL has way much better support for ARRAYs than legacy SQL. And in general, BigQuery team recommend using Standard SQL

So, below is for BigQuery Standard SQL (including handling wildcard stuff)

#standardSQL
SELECT 
  user_dim.user_id AS user_id,
  (SELECT value.value.string_value 
     FROM UNNEST(user_dim.user_properties) 
     WHERE key = 'content_group' LIMIT 1
  ) content_group,
  (SELECT event.timestamp_micros 
     FROM UNNEST(event_dim) event 
     WHERE name = 'main_menu_item_selected'
  ) ts
FROM `project.dataset.app_events_*`
WHERE _TABLE_SUFFIX BETWEEN '20180129' AND '20180202'   

with result (for the dummy example from your question)

Row     user_id     content_group       ts   
1       123         my_group            1517584420597000     

Upvotes: 2

Related Questions