screechOwl
screechOwl

Reputation: 28129

SQL Server count number of distinct values in each column of a table

I have a table with ~150 columns. I'd like to find the count(distinct(colName)) for each column but am wondering if there's a way to do so without actually typing out each column name.

Ideally I would use count(distinct(*)) but this doesn't work.

Any other suggestions?

EDIT:

if this is my table:

  id         col1         col2        col3      ...
  01         10001       west         north  
  02         10001       west         south  
  03         10002       east         south  
  04         10002       west         north  
  05         10001       east         south  
  06         10003       west         north 

I'm looking for this output

count(distinct(id))   count(distinct(col1))    count(distinct(col2))   count(distinct(col3))
       6                       3                    2                      2

Upvotes: 2

Views: 11119

Answers (5)

J. Chris Compton
J. Chris Compton

Reputation: 556

Expanded answer from Bryan. His great answer lists the fields alphabetically.
This is no problem if you have a dozen fields or so. If you have 150 fields, like OP stated, this keeps the fields in their table's order.
I modified his query in order to examine a 213 column (vendor's) table and wanted to post for future reference.

DECLARE @Table SYSNAME = 'Your table name; without schema; no square brackets'; 

-- REVERSE and STUFF used to remove trailing UNION in string
SELECT REVERSE(STUFF(REVERSE((SELECT 'SELECT ' 
        + CAST(column_id AS VarChar(4)) + ' AS [column_id],' -- extra column
        + '''' + name 
        + ''' AS [Column], COUNT(DISTINCT(' 
        + QUOTENAME(name) + ')) AS [Count] FROM ' 
        + QUOTENAME(@Table) + ' UNION ' 
    -- get column name from sys.columns  
    FROM   sys.columns 
    WHERE  system_type_id NOT IN (34,240) AND object_id = Object_id(@Table) 
    ORDER BY column_id -- keeps columns in table order
    -- concatenate result strings with FOR XML PATH
    FOR XML PATH (''))), 1, 7, ';'));

I decided to not edit Bryan's answer because often people won't need the extra column.
(The ORDER BY has no effect if you don't add the column_id column. I believe this is because only the outermost ORDER BY is guarenteed to order the final output; I'd love to have a msft reference that confirms this)

EDIT: Using function Count with field types Image and Geography throws error.
Added "system_type_id NOT IN (34,240)".

Upvotes: 1

Bryan
Bryan

Reputation: 17693

Use the below script to build T-SQL query that will return a distinct count of each column in a table. Replace @Table value with your table name.

DECLARE @Table SYSNAME = 'TableName'; 

-- REVERSE and STUFF used to remove trailing UNION in string
SELECT REVERSE(STUFF(REVERSE((SELECT 'SELECT ''' + name 
                                     + ''' AS [Column], COUNT(DISTINCT(' 
                                     + QUOTENAME(name) + ')) AS [Count] FROM ' 
                                     + QUOTENAME(@Table) + ' UNION ' 
                              -- get column name from sys.columns  
                              FROM   sys.columns 
                              WHERE  object_id = Object_id(@Table)
                              -- concatenate result strings with FOR XML PATH
                              FOR XML PATH (''))), 1, 7, ';'));

Upvotes: 2

Void Ray
Void Ray

Reputation: 10199

This should do:

select count(*) from (select distinct * from myTable) as t

Here is SQL Fiddle test.

create table Data
(
    Id int,
    Data varchar(50)
)

insert into Data 
select 1, 'ABC'
union all
select 1, 'ABC'

select count(*) 
from (select distinct * from Data) as t

Upvotes: -1

rs.
rs.

Reputation: 27427

You can do this:

DECLARE @query varchar(max)
    SELECT @query = 
    'SELECT ' + SUBSTRING((SELECT ',' +'COUNT(DISTINCT(' + column_name + ')) 
             As ' + column_name + ' '  
             FROM information_schema.columns
             WHERE 
             table_name = 'table_name'
             for xml path('')),2,200000)  +  'FROM table_name'

PRINT(@query)

Upvotes: 6

RyanS
RyanS

Reputation: 78

I don't believe this is possible with just MySQL. I would think your going to have to use a server side language to get the results you want.

Use "DESC TABLE" as your first query and then for each "field" row, compile your query.


Ignore this, wrong system tag :)

Upvotes: 0

Related Questions