Josh Chang
Josh Chang

Reputation: 75

Grouping data without an aggregate function in SQL

I was wondering if anyone could lend some insight into this problem. I'm managing a number of stock portfolios which must contain the same stocks. I need to design a query that will compare all portfolios against all portfolios and return the stocks that exist in one, but not the other.

For simplicity's sake, let's say I have a table that looks like this:

stock_symbol     portfolio
AAPL                A
IBM                 A
MCD                 A
NFLX                A
AAPL                B
IBM                 B
MCD                 B
FB                  B
AAPL                C
IBM                 C
MCD                 C

Ideally, I want the query to return something like this:

p1    p2    stock_symbol
A     B         NFLX
A     C         NFLX
B     A          FB
B     C          FB

So comparing A to B will return NFLX, while comparing B to A will return FB.

Currently, I've got a query that works with a small number of portfolios, but I'm going to be managing >20 portfolios soon. That's hundreds of comparisons. I want to use GROUP BY, but I don't have an aggregate function.

Any ideas as to what I can do? Thanks!

Upvotes: 0

Views: 63

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1270421

This type of query doesn't need group by. It needs left join. Here is an example query that should do what you want:

select p1.portfolio, p2.portfolio, p1.stock_symbol
from table p1 left join
     table p2
     on p1.stock_symbol = p2.stock_symbol and
        p1.portfolio <> p2.portfolio
where p2.stock_symbol is null;

EDIT:

That is such a good point that p2.portfolio will be NULL. Here is a better solution:

select p1.portfolio, p2.portfolio, p1.stock_symbol
from (select distinct portfolio from table)  p1 cross join
     (select distinct portfolio from table) p2 left join
     table sp1 
     on sp1.portfolio = p1.portfolio left join
     table sp2
     on sp1.stock_symbol = sp2.stock_symbol 
where sp2.stock_symbol is null;

Upvotes: 3

SoulTrain
SoulTrain

Reputation: 1904

Give this a try...

Setup:

select * into #tbla from (
SELECT 'AAPL' stock_symbol, 'A' portfolio union all
SELECT 'IBM ',              'A' union all
SELECT 'MCD ',              'A'union all
SELECT 'NFLX',              'A'union all
SELECT 'AAPL',              'B'union all
SELECT 'IBM ',              'B'union all
SELECT 'MCD ',              'B'union all
SELECT 'FB  ',              'B'union all
SELECT 'AAPL',              'C'union all
SELECT 'IBM ',              'C'union all
SELECT 'MCD ',              'C'
)a

Query:

    WITH symbols
AS (
    SELECT DISTINCT stock_symbol
    FROM #tbla
    )
    ,portfolios
AS (
    SELECT DISTINCT portfolio
    FROM #tbla
    )
    ,mstr
AS (
    SELECT *
    FROM portfolios A
    CROSS JOIN symbols B
    )
    ,interim
AS (
    SELECT *
    FROM mstr m
    WHERE NOT EXISTS (
            SELECT 1
            FROM #tbla A
            WHERE m.portfolio = A.portfolio
                AND m.stock_symbol = A.stock_symbol
            )
    )
SELECT A.portfolio
    ,b.portfolio
    ,b.stock_symbol
FROM interim A
CROSS JOIN interim B
WHERE A.portfolio <> B.portfolio
    AND A.stock_symbol <> B.stock_symbol

Result:

portfolio   portfolio   stock_symbol
A           B             NFLX
A           C             NFLX
B           A             FB  
B           C             FB  
C           B             NFLX
C           A             FB  

Upvotes: 0

Related Questions