Vadim Tikanov
Vadim Tikanov

Reputation: 671

Changing query to avoid "Aggregations of aggregations are not allowed" in Bigquery

Given user and order tables, I need to count users who made their first order on the next day after registration date.

I managed to list such users with the following query:

SELECT 
  users.first_name as first_name,
  users.last_name as last_name,
  users.registration_date as registration_date,
  min(orders.order_date) as first_order_date
FROM `users_table` as users
  JOIN `orders_table` as orders
  ON users.id = orders.user_id
GROUP BY
  first_name,
  last_name,
  registration_date
HAVING
  date_diff(first_order_date, registration_date, DAY) = 1
ORDER BY
  registration_date ASC
LIMIT 5

Resulting in:

+------------+-----------+-------------------+------------------+
| first_name | last_name | registration_date | first_order_date |
+------------+-----------+-------------------+------------------+
| Albert     | Ellis     | 2013-04-11        | 2013-04-12       |
| Charles    | Moore     | 2014-04-29        | 2014-04-30       |
| Jimmy      | Payne     | 2014-07-01        | 2014-07-02       |
| Angela     | Stanley   | 2014-10-21        | 2014-10-22       |
| Marie      | Bishop    | 2014-11-15        | 2014-11-16       |
+------------+-----------+-------------------+------------------+

Now, I can't wrap my head around counting them. When I try something like:

SELECT 
  count(date_diff(min(orders.order_date), users.registration_date, DAY) = 1)
FROM `users_table` as users
  JOIN `orders_table` as orders
  ON users.id = orders.user_id

I receive "Aggregations of aggregations are not allowed" error. How do I amend query to resolve that?

Upvotes: 3

Views: 12045

Answers (3)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172944

Below is for BigQuery Standard SQL

#standardSQL
SELECT COUNT(1) next_day_order_users
FROM `project.dataset.users_table` AS users
JOIN (
  SELECT user_id, MIN(order_date) first_order_date 
  FROM `project.dataset.orders_table`
  GROUP BY user_id
) AS orders
ON users.id = orders.user_id
WHERE DATE_DIFF(first_order_date, registration_date, DAY) = 1

Upvotes: 4

Gordon Linoff
Gordon Linoff

Reputation: 1269445

Why not just use a JOIN condition?

SELECT COUNT(DISTINCT u.id)
FROM `users_table` u JOIN
     `orders_table` o        
     ON u.id = o.user_id AND
        date_diff(o.order_date, u.registration_date, DAY) = 1;

The COUNT(DISTINCT accounts for the fact that users could have multiple orders in one day.

Upvotes: 0

lypskee
lypskee

Reputation: 342

Just put your query into subquery. You are already choosing clients who ordered next day after registration. So answer is a number of rows in your query

select count(1)
from ( SELECT 
  users.first_name as first_name,
  users.last_name as last_name,
  users.registration_date as registration_date,
  min(orders.order_date) as first_order_date
FROM `users_table` as users
  JOIN `orders_table` as orders
  ON users.id = orders.user_id
GROUP BY
  first_name,
  last_name,
  registration_date
HAVING
  date_diff(first_order_date, registration_date, DAY) = 1 ) x

Upvotes: 2

Related Questions