damd
damd

Reputation: 6967

Referencing other columns in a SQL SELECT

I have a SQL query in BigQuery:

SELECT
  creator.country,
  (SUM(length) / 60) AS total_minutes,
  COUNT(DISTINCT creator.id) AS total_users,
  (SUM(length) / 60 / COUNT(DISTINCT creator.id)) AS minutes_per_user
FROM
  ...

You may have noticed that the last column is equivalent to total_minutes / total_users.

I tried this, but it doesn't work:

SELECT
  creator.country,
  (SUM(length) / 60) AS total_minutes,
  COUNT(DISTINCT creator.id) AS total_users,
  (total_minutes / total_users) AS minutes_per_user
FROM
  ...

Is there any way to make this simpler?

Upvotes: 3

Views: 2152

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 173028

Another option is to move all business logic of metrics calculation into UDF (temp or permanent depends on usage needs) ...

create temp function custom_stats(arr any type) as ((
  select as struct    
    sum(length) / 60 as total_minutes,
    count(distinct id) as total_users,
    sum(length) / 60 / count(distinct id) as minutes_per_user
  from unnest(arr)
));

... and thus keep query itself simple and least verbose - as in below example

select creator.country,
  custom_stats(array_agg(struct(length, creator.id))).*
from `project.dataset.table`
group by country

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269973

Not really. That is, you cannot re-use column aliases in expressions in the same SELECT. If you really want, you can use a subquery or CTE:

SELECT c.*,
       total_minutes / total_users
FROM (SELECT creator.country,
             (SUM(length) / 60) AS total_minutes,
              COUNT(DISTINCT creator.id) AS total_users
      FROM
     ) c;

Upvotes: 4

Related Questions