Reputation: 423
I need help with a group-by query. My table looks like this:
CREATE MULTISET TABLE MY_TABLE (PERSON CHAR(1), ITEM CHAR(1), COST INT);
INSERT INTO MY_TABLE VALUES ('A', '1', 5);
INSERT INTO MY_TABLE VALUES ('A', '1', 5);
INSERT INTO MY_TABLE VALUES ('A', '2', 1);
INSERT INTO MY_TABLE VALUES ('B', '3', 0);
INSERT INTO MY_TABLE VALUES ('B', '4', 10);
INSERT INTO MY_TABLE VALUES ('B', '4', 10);
INSERT INTO MY_TABLE VALUES ('C', '5', 1);
INSERT INTO MY_TABLE VALUES ('C', '5', 1);
INSERT INTO MY_TABLE VALUES ('C', '5', 1);
+--------+------+------+
| PERSON | ITEM | COST |
+--------+------+------+
| A | 1 | 5 |
| A | 1 | 5 |
| A | 2 | 1 |
| B | 3 | 0 |
| B | 4 | 10 |
| B | 4 | 10 |
| C | 5 | 1 |
| C | 5 | 1 |
| C | 5 | 1 |
+--------+------+------+
I need to group items and costs by person, but in different ways. For each person, I need the number of unique items they have. Ex: Person A
has two distinct items, item 1 and item 2. I can get this with COUNT(DISTINCT ITEM)
.
Then for each person, I need to sum the cost but only once per distinct item (for duplicate items, the cost is always the same). Ex: Person A
has item 1 for $5, item 1 for $5, and item 2 for $1. Since this person has item 1 twice, I count the $5 once, and then add the $1 from item 2 for a total of $6. The output should look like this:
+--------+---------------------+------------------------+
| PERSON | ITEM_DISTINCT_COUNT | COST_DISTINCT_ITEM_SUM |
+--------+---------------------+------------------------+
| A | 2 | 6 |
| B | 2 | 10 |
| C | 1 | 1 |
+--------+---------------------+------------------------+
Is there an easy way to do this that performs good on a lot of rows?
SELECT PERSON
,COUNT(DISTINCT ITEM) ITEM_DISTINCT_COUNT
-- help with COST_DISTINCT_ITEM_SUM
FROM MY_TABLE
GROUP BY PERSON
Upvotes: 0
Views: 2190
Reputation: 1270371
I recommend a two levels of aggregation:
select person, count(*) as num_items, sum(cost)
from (select person, item, avg(cost) as cost
from my_table t
group by person, item
) t
group by person;
Upvotes: 0
Reputation: 147206
You can make a subquery which gets the distinct values of item
and cost
for each person, and then aggregate over that:
SELECT PERSON,
COUNT(ITEM) AS ITEM_DISTINCT_COUNT,
SUM(COST) AS COST_DISTINCT_ITEM_SUM
FROM (
SELECT DISTINCT PERSON, ITEM, COST
FROM MY_TABLE
) M
GROUP BY PERSON
Output:
PERSON ITEM_DISTINCT_COUNT COST_DISTINCT_ITEM_SUM
A 2 6
B 2 10
C 1 1
Upvotes: 3