Reputation: 869
I have a ~70K-entry table for orders, as follows:
I would like to, on a per-customer basis, determine what the most common order is, as well as to what certainty that order is (sample size and probability).
This is what I have so far:
CREATE VIEW CustomerOrderProbabaility as
SELECT Distinct(customerID)
customerID,
order,
COUNT(*) as sampleSize
FROM (Select customerID, order1 AS order FROM orderTable UNION
Select customerID, order2 AS order FROM orderTable UNION
Select customerID, order3 AS order FROM orderTable
)
GROUP BY customerID, order
ORDER BY customerID, COUNT(*) DESC;
And I'm getting a table of customerId
s and order
, but sampleSize
is always 1
. Where am I going wrong?
Upvotes: 0
Views: 26
Reputation: 1269703
I think you want UNION ALL
as well as some other changes:
CREATE VIEW CustomerOrderProbabaility as
SELECT DISTINCT ON (customerID)
customerID,
order,
COUNT(*) as sampleSize,
SUM(COUNT(*)) OVER (PARTITION BY customerId) as totOrders
FROM (Select customerID, order1 AS theorder FROM orderTable UNION ALL
Select customerID, order2 AS theorder FROM orderTable UNION ALL
Select customerID, order3 AS theorder FROM orderTable
) co
GROUP BY customerID, theorder
ORDER BY customerID, COUNT(*) DESC;
UNION
removes duplicates.
Changes:
order
to theorder
. order
is a keyword. Even if accepted as a column name, I don't think it is a good idea.UNION ALL
instead of UNION
so duplicates are not removed.DISTINCT ON
instead of DISTINCT
, because this is your intention.TotOrders
to count all orders for each customer.Upvotes: 1