Hyperion
Hyperion

Reputation: 168

How to combine two BigQuery queries where I've used different fields to group by?

I have a table of transactions from a marketplace. It has three fields: buyer_email, seller_email, date.

I would like to know who are the most active buyers and sellers, assuming that buyers can be sellers and that sellers can be buyers. By "most active" I mean the users that have made the most transactions in the last N days - whether they are buyers or sellers.

I wrote this query to get the most active buyers:

SELECT buyer_email, COUNT(buyer_email) AS number_of_purchases
FROM table 
GROUP BY buyer_email
ORDER BY COUNT(buyer_email) DESC

The results look like this:

|      buyer_email                       |     number_of_purchases    |
| -------------------------------------- | -------------------------- |
|          [email protected]            |         74                 |
|          [email protected]         |         42                 |
|          [email protected]      |         31                 |
|          [email protected]       |         19                 |

And I wrote another query to get the list of most active sellers:

SELECT seller_email, COUNT(seller_email) AS number_of_sales
FROM table 
GROUP BY seller_email
ORDER BY COUNT(seller_email) DESC

The results of which look like this:

|      seller_email                  |     number_of_sales    |
| ---------------------------------- | ---------------------- |
|          [email protected]  |         156            |
|          [email protected]   |         89             |
|          [email protected]     |         23             |
|          [email protected]        |         12             |

I would like to combine both query results to get something like this:

|      user_email          |  number_of_sales    | number_of_purchases | total    |
| ------------------------ | ------------------- | ------------------- | -------- |
| [email protected] |         156         |           31        |    187   |
| [email protected]  |         89          |           19        |    108   | 
| [email protected]       |         12          |           74        |    86    |
| [email protected]    |         23          |           42        |    65    |

However, there are some things to take into account:

What I've tried:

I'm not sure if that's clear, but if anyone could help me achieve the aforementioned result, that would be great.

Upvotes: 1

Views: 155

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

Consider below approach

select *, purchase + sale as total from (
  select buyer_email as user_email, 'purchase' type from your_table
  union all select seller_email, 'sale' from your_table 
)
pivot (count(*) for type in ('purchase', 'sale'))

Upvotes: 2

tom10271
tom10271

Reputation: 4649

Use FULL OUTER JOIN.

WITH
    buyer_data AS (
        SELECT
            buyer_email        AS email,
            COUNT(buyer_email) AS number_of_purchases
        FROM table
        GROUP BY
            buyer_email
        ORDER BY
            COUNT(buyer_email) DESC
    ),
    seller_data AS (
        SELECT
            seller_email        AS email,
            COUNT(seller_email) AS number_of_sales
        FROM table
        GROUP BY
            seller_email
        ORDER BY
            COUNT(seller_email) DESC
    )
SELECT
    COALESCE(buyer_data.email, seller_data.email)                   AS email,
    COALESCE(number_of_purchases, 0)                                AS number_of_purchases,
    COALESCE(number_of_sales, 0)                                    AS number_of_sales,
    COALESCE(number_of_purchases, 0) + COALESCE(number_of_sales, 0) AS total
FROM buyer_data
FULL OUTER JOIN seller_data ON seller_data.email = buyer_data.email

Upvotes: 3

Related Questions