GPGVM
GPGVM

Reputation: 5619

TSQL sub query invalid column and only one expression

It has been a while since writing T-SQL for me and I know this can be done but my memory is good enough to get me close (I think I'm close) but poor enough to not get it right.

To start I have this query:

    SELECT DISTINCT(COMM_TYPE), 
        COUNT(COMM_TYPE) AS 'Total'
    FROM 
        [MYDB].[dbo].[COMM]
    GROUP BY 
        COMM_TYPE

Which returns:

COMM_TYPE        Total
--------------------------
TypeA             1
TypeB            44474
TypeC             3
TypeD            3854
TypeE            12327
TypeF            362912
TypeG            484344
TypeH             386
TypeI             106

This is an accurate result.

So now I want the above PLUS a sample of each one. Something with columns like:

ID     COMM_TYPE     TOTAL     DATA     COMMENTS     PRIMARY     COMM_NUMBER

I believe this can be done with a sub query but I am not writing it correctly as I get two errors.

Msg 207, Level 16, State 1, Line 10
Invalid column name 'CT'.

Msg 116, Level 16, State 1, Line 7
Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.

The second error I understand. My sub query has two columns being returned but positioned in the select as I have it wants only one.

The first error I'm more lost on. I thought I could reference an sub query column in the outer query?

Here is the query:

SELECT TOP(1) 
    *,
    (SELECT 
        DISTINCT(COMM_TYPE), 
        COUNT(COMM_TYPE)
     FROM 
        [MYDB].[dbo].[COMM]
     GROUP BY 
        COMM_TYPE) AS CT
FROM 
    [MYDB].[dbo].[COMM]
WHERE 
    CT = COMM_TYPE

This is mostly for myself but if it helps anyone here ya go:

We start with a (cte to wrap the entire operation as it bring many benefits but the two applicable here are:

1.Enable grouping by a column that is derived from a scalar subselect. 2.Reference the resulting table multiple times in the same statement

WITH T
     AS (
          CTE SELECT Statement
        )
FINAL SELECT Statement

Next our CTE select basically return three columns for us.

1.Total which in my query was COUNT on a column 2.RN which is the row number 3.Wildcard * which gets all the columns from the table

Now from this point we get into the Partitioning....

So it seems that we need to choose how we are going to break this table up. Since I had defined DISTINCT(COMM_TYPE) without realizing it there was my partition....in that first column definition we also do a count(*). So what must be happening is that first SQL engine breaks table into pieces (partitions) then does a count of records in those pieces....????

SELECT Count(*) 
    OVER (PARTITION BY COMM_TYPE) AS Total,

Next we do a row_number() operation OVER (aka operating against) again my partition of COMM_TYPE...we then order it and project the column name of rn....kinda not sure why this is needed till I got to the end then it made sense.

Row_number() 
    OVER (PARTITION BY COMM_TYPE 
    ORDER BY COMM_TYPE) AS RN,

finally we just pull a wildcard which is every column in the table.

So in the depths of the SQL engine namespace memory registers this must be quite a big hunk of data with these repeated grouping operations "OVER" everything.

However all we see is a single row and that is because of the last select which gives me everything all mushed together as I wanted and we only get the TOP(1) because of that RN column I didn't understand earlier.

Do I understand it properly?

Upvotes: 1

Views: 244

Answers (1)

Martin Smith
Martin Smith

Reputation: 453067

This should do what you need.

WITH T 
     AS (SELECT Count(*) 
                  OVER (PARTITION BY COMM_TYPE) AS Total, 
                Row_number() 
                  OVER (PARTITION BY COMM_TYPE 
                            ORDER BY COMM_TYPE)     AS RN,
         *
         FROM   MyDb.dbo.Comm) 
SELECT * 
FROM   T 
WHERE  RN = 1 

Upvotes: 1

Related Questions