alwaysVBNET
alwaysVBNET

Reputation: 3310

Return a percentage of NULL values for a specific record

The following query returns the values of the table for each field in terms of null percentage . What I want is to get the sum of those percentages for a specific ProductID. Also, I would like to get a percentage (in an extra column) of the fields do not have value i.e. ="". Any ideas?

use AdventureWorks 
DECLARE @TotalCount decimal(10,2), @SQL NVARCHAR(MAX)
SELECT @TotalCount = COUNT(*) FROM [AdventureWorks].[Production].[Product] 

SELECT @SQL = 

COALESCE(@SQL + ', ','SELECT ') +
'cast(sum (case when ' + QUOTENAME(column_Name) + 
' IS NULL then 1 else 0 end)/@TotalCount*100.00 as decimal(10,2)) as [' +
column_Name + ' NULL %]
'
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Product' and TABLE_SCHEMA = 'Production'
SET @SQL = 'set @TotalCount = NULLIF(@TotalCount,0) 
' + @SQL + '
FROM [AdventureWorks].Production.Product'
print @SQL
EXECUTE SP_EXECUTESQL @SQL, N'@TotalCount decimal(10,2)', @TotalCount

Upvotes: 1

Views: 1709

Answers (2)

FrankPl
FrankPl

Reputation: 13315

You can use the following:

use AdventureWorks

DECLARE @colCount int;
DECLARE @nullCheck nvarchar(max) = N'';
DECLARE @emptyCheck nvarchar(max) = N'';
DECLARE @SQL NVARCHAR(MAX);
DECLARE @KeyToCheck int = 123; -- adapt as necessary

SELECT 
    @nullCheck += '
         + ' + 'count(' + QUOTENAME(column_Name) + ')'
   ,@emptyCheck += '
         + ' +
         CASE
             WHEN DATA_TYPE IN('bigint', 'int', 'smallint', 'tinyint', 'bit', 'money', 'smallmoney', 'numeric', 'decimal', 'float', 'real') THEN
                 -- check numeric data for zero
                 'sum(case when coalesce(' + QUOTENAME(column_Name) + ', 0) = 0 then 1 else 0 end)' 
             WHEN DATA_TYPE like '%char' or DATA_TYPE like '%text' THEN 
                 --check character data types for empty string
                 'sum(case when coalesce(' + QUOTENAME(column_Name) + ', '''') = '''' then 1 else 0 end)'
             ELSE -- otherwise, only check for null
                 'sum(case when ' + QUOTENAME(column_Name) + ' IS NULL then 1 else 0 end)'
         END
   ,@colCount = 
        count(*) over()

FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Product' and TABLE_SCHEMA = 'Production'
;

SET @SQL = 'SELECT case when count(*) > 0 then 100.00 - (' + @nullCheck + '
      ) * 100.00 / ' + cast(@colCount as nvarchar(max)) + '.00 / count(*) end as null_percent
      , case when count(*) > 0 then (' + @emptyCheck + '
      ) * 100.00 / ' + cast(@colCount as nvarchar(max)) + '.00 / count(*) end as empty_percent
FROM Production.Product
WHERE ProductID = ' + cast(@KeyToCheck as nvarchar(max))
;

print @SQL;

EXECUTE (@SQL)

I simplified one of your expressions: Instead of sum (case when <column> IS NULL then 1 else 0 end), you can just use count(<column>). When using count with an expression instead of *, it counts the rows where this expression is non-null. As this is the opposite from what you need, I added the 100.00 - as the start of the SELECT.

For the "empty check", this would make the logic more complex to understand, hence I left the original logic there and extended it. There, I implemented an check for emptiness for numeric and character/text data types. You can easily extend that for date, binary data etc. with whichever logic you use to determine if a column is empty.

I also found it more simple to leave first + in the two variables @nullCheck and @emptyCheck, as it is valid SQL to start an expression wit this.

I also extended the statement so that if there would potentially be more than one record with ProductId = 123, it shows the average across all records, i. e. the total sum divided by the count of rows. And the outermost case expressions just avoid an division by zero error if count(*) would be zero, i. e. no record with ProductId = 123 found. In that case the return value is null.

Upvotes: 2

Kaspars Ozols
Kaspars Ozols

Reputation: 7017

You could use AVG function:

SELECT AVG(CASE WHEN value IS NULL THEN 100 ELSE 0 END) AS Percents
FROM Table

UPDATE:

Here is your script:

DECLARE @SQL NVARCHAR(MAX), @TABLE_NAME NVARCHAR(MAX), @TABLE_SCHEMA NVARCHAR(MAX), @PK NVARCHAR(MAX)

SET @TABLE_NAME = 'tblBigTable'
SET @TABLE_SCHEMA = 'dbo'
SET @PK = '8'

SELECT
    @SQL = COALESCE(@SQL + ', ', 'SELECT ') +'AVG(CASE WHEN ' + COLUMN_NAME + ' IS NULL THEN 100 ELSE 0 END) AS [' + COLUMN_NAME +' NULL %]'
FROM 
    INFORMATION_SCHEMA.COLUMNS
WHERE 
    TABLE_SCHEMA = @TABLE_SCHEMA AND 
    TABLE_NAME = @TABLE_NAME

SET @SQL = @SQL + ' FROM ' + @TABLE_NAME + ' WHERE pkId = ''' + @PK + ''''

print @SQL

EXECUTE SP_EXECUTESQL @SQL

Upvotes: 0

Related Questions