user3142655
user3142655

Reputation: 109

BigQuery : Scalar Subquery produced more than one - Custom Dimensions

Im trying to get a custom dimension into my one of my unions, but im getting a issue with scalar sub query produced more than one element. I believe the issue lies within this code. Im trying to migrate across to standard SQL so please give answer in standard SQL.

SELECT
      d.value
FROM
      UNNEST(hits) AS hits,
      UNNEST(hits.customDimensions) AS d
WHERE
      d.index = 65) AS viewID,

Overall example of the query

#standardSQL
SELECT
  date,
  channelGrouping,
  viewID,
  SUM(Revenue) Revenue,
  SUM(Shipping) Shipping,
  SUM(bounces) bounces,
  SUM(transactions) transactions,
  COUNT(date) sessions
FROM (
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    (
    SELECT
      d.value
    FROM
      UNNEST(hits) AS hits,
      UNNEST(hits.customDimensions) AS d
    WHERE
      d.index = 65) AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703' )
GROUP BY
  date,
  channelGrouping,
  viewID

Upvotes: 0

Views: 14378

Answers (2)

Willian Fuks
Willian Fuks

Reputation: 11787

You can simulate some data in BigQuery to better understand what's going on here.

For instance, this data simulates the hits schema in ga_sessions:

WITH data AS(
  select ARRAY<STRUCT<hitNumber INT64, customDimensions ARRAY<STRUCT<index INT64, value STRING>> >> [STRUCT(1 as hitNumber, [STRUCT(1 as index, 'val1' as value), STRUCT(2 as index, 'val2' as value), STRUCT(3 as index, 'val3' as value)] as customDimensions), STRUCT(2 as hitNumber, [STRUCT(1 as index, 'val1' as value)] as customDimensions)] hits
)

select * from data

Now, if you run your query against this simulated data querying where index = 1, you will get the same error because in two different places the index is 1.

In order for this to work, you would have to bring it as an ARRAY like so:

SELECT
  array(select custd.value from unnest(hits) hits, unnest(hits.customDimensions) custd where index = 1)
FROM data

And you'll see the result:

enter image description here

So in your query you'll have to adapt for either returning this value as an ARRAY or, if for all values where index=65 the value is the same, you can do something like:

SELECT
  (select custd.value from unnest(hits) hits, unnest(hits.customDimensions) custd where index = 1 limit 1)
FROM data

This will bring just one result in the scalar sub-query.

Upvotes: 3

Elliott Brossard
Elliott Brossard

Reputation: 33755

The problem is that some or all hits have a custom dimension with an index of 65. There are a few different ways to address this. You can use an ARRAY subquery to get all of the values with that index:

ARRAY(
  SELECT
    d.value
  FROM
    UNNEST(hits) AS hits,
    UNNEST(hits.customDimensions) AS d
  WHERE
    d.index = 65) AS viewIDs,

This will give you all view IDs across hits, but you would need to use an array for viewID in the first query of the union as well. Another option is just to get the view ID from the first hit:

(
  SELECT
    d.value
  FROM
    UNNEST(hits[SAFE_OFFSET(0)].customDimensions) AS d
  WHERE
    d.index = 65) AS viewID

Or, if you don't care which view ID you get, you could use a LIMIT to get an arbitrary ID:

(
  SELECT
    d.value
  FROM
    UNNEST(hits) AS hits,
    UNNEST(hits.customDimensions) AS d
  WHERE
    d.index = 65
  LIMIT 1) AS viewID,

Upvotes: 2

Related Questions