taserian
taserian

Reputation: 623

SQL Server - Reference a column by name in aggregate function

Our application uses a series of tables with many varchar(1000) columns in them. During development of the system, that was considered to be more than enough space for the users to enter text. However, now we're encountering problems with text being cut off, so it seems like some users are entering more text that it can hold. However, I believe that there's rampant misuse of these fields, and I want to see which columns are being used.

From a different question that was answered here on SO, I've come up with the following query to extract the names of the varchar columns in a table:

select
   syscolumns.name as [Column],
   syscolumns.length as [MaxLength]
from 
   sysobjects, syscolumns 
where sysobjects.id = syscolumns.id
and   sysobjects.xtype = 'u'
and   sysobjects.name = 'TableWithTooManyVarcharColumns'
and syscolumns.xusertype = 167

but now I want to use the names of these columns in a new query, and calculate

SELECT 
[Column] as [Name],
[MaxLength],
MAX(LEN([Column])) as [MaxUsedLength],
AVG(LEN([Column])) as [AvgUsedLength],
STDEV(LEN([Column])) as [StdDev]
FROM TableWithTooManyVarcharColumns
INNER JOIN **{{ reference to table from query above }}**

Of course, the first query returns the name of the column, while the second one needs a reference to the column, so I'm not sure how to combine them properly. Any SQL Server gurus able to help?

Upvotes: 2

Views: 9081

Answers (2)

taserian
taserian

Reputation: 623

For the sake of completeness, I'm going to add what turned out to be my solution. It's patterned on swasheck's solution, so he gets full credit for his answer, and this solution is kind of messy, but it should serve as a foundation for any other people who might come across this problem:

DECLARE @TableName nvarchar(100) = 'TableToAnalyze' -- Change to your table name

declare @SQLStat nvarchar(4000)
declare @ColName nvarchar(100)
declare @MaxLength integer
declare @MaxUsedLength integer
declare @AvgUsedLength float
declare @StdDev float
declare @parm1IN nvarchar(100)
declare @parm2IN integer
declare @parm3IN integer
declare @parm4IN float
declare @parm5IN float
declare @parm1O integer
declare @parm2O float
declare @parm3O float

CREATE TABLE #Details (
    ColumnName nvarchar(100) NULL,
    MaxLength integer null,
    MaxUsedLength integer null,
    AvgUsedLength float null,
    StdDev float null
)

declare c cursor for 
    select c.name, c.max_length
    from sys.columns c 
        join sys.tables t
            on c.object_id = t.object_id
    where t.name = @TableName
    and c.user_type_id = 167

DECLARE @ParmDefinition1 NVARCHAR(500)

SET @ParmDefinition1 = N'@parm1IN nvarchar(100), 
                         @parm1O int OUTPUT, 
                         @parm2O float OUTPUT, 
                         @parm3O float OUTPUT'

DECLARE @ParmDefinition2 NVARCHAR(500)

SET @ParmDefinition2 = N'@parm1IN nvarchar(100), 
                         @parm2IN int, 
                         @parm3IN int, 
                         @parm4IN float, 
                         @parm5IN float'

open c 
fetch next from c into 
    @ColName, @MaxLength

    while @@FETCH_STATUS = 0
        begin
            set @SQLStat = N'SELECT @parm1O = MAX(LEN('+ QUOTENAME(@ColName) + ')),
                    @parm2O = AVG(LEN('+ QUOTENAME(@ColName) + ')),
                    @parm3O = STDEV(LEN('+ QUOTENAME(@ColName) + '))
                    FROM ' + QUOTENAME(@TableName)

            EXECUTE sp_executesql @SQLStat, @ParmDefinition1, 
            @parm1IN = @ColName,
            @parm1O = @MaxUsedLength OUTPUT,    
            @parm2O = @AvgUsedLength OUTPUT,
          @parm3O = @StdDev OUTPUT


            set @SQLStat = 'INSERT INTO #Details (  ColumnName, MaxLength, MaxUsedLength, AvgUsedLength, StdDev)'
            + ' VALUES(@parm1IN, @parm2IN, @parm3IN, @parm4IN, @parm5IN)'

            EXECUTE sp_executesql @SQLStat, @ParmDefinition2,
            @parm1IN = @ColName,
            @parm2IN = @MaxLength,
            @parm3IN = @MaxUsedLength,  
            @parm4IN = @AvgUsedLength,
          @parm5IN = @StdDev

            fetch next from c 
                into @ColName, @MaxLength
        end
    close c 
    deallocate c

SELECT * FROM #Details

DROP TABLE #Details

Upvotes: 2

swasheck
swasheck

Reputation: 4693

Either join two derived tables or use a CTE. By the way, you're using some outdated terminology for your schema references

select 
      DefinedName = sysdef.name,
      Columnname = tbl.name,
      DefinedLength = sysdef.max_length, 
      etc ...
from (
        select c.name, c.max_length
            from sys.columns c 
        join sys.tables t 
            on c.object_id = t.object_id
            where t.name = 'TableWithTooManyVarcharColumns'
) sysdef
join (
        SELECT 
                [Column] as [Name],
                [MaxLength],
                MAX(LEN([Column])) as [MaxUsedLength],
                AVG(LEN([Column])) as [AvgUsedLength],
                STDEV(LEN([Column])) as [StdDev]
        FROM TableWithTooManyVarcharColumns
) tbl
on sysdef.name = tbl.name

OR

;WITH sysdef AS (
        select c.name, c.max_length
                from sys.columns c 
            join sys.tables t 
                on c.object_id = t.object_id
                where t.name = 'TableWithTooManyVarcharColumns'
),
tbl AS (
        SELECT 
                [Column] as [Name],
                [MaxLength],
                MAX(LEN([Column])) as [MaxUsedLength],
                AVG(LEN([Column])) as [AvgUsedLength],
                STDEV(LEN([Column])) as [StdDev]
        FROM TableWithTooManyVarcharColumns
)
select
      DefinedName = sysdef.name,
      Columnname = tbl.name,
      DefinedLength = sysdef.max_length, 
      etc ...
from tbl t 
join sysdef d
on tbl.name = d.name

EDIT

declare @sql nvarchar(4000)
declare @colname nvarchar(100)
declare @max_length nvarchar(15)
declare c cursor for 
    select c.name, c.max_length
    from sys.columns c 
        join sys.tables t
            on c.object_id = t.object_id
    where t.name = 'TableWithTooManyVarcharColumns'

open c 
fetch next from c into 
    @colname, @max_length

    while @@FETCH_STATUS = 0
        begin
            set @sql = 'SELECT ''' + QUOTENAME(@colname) + ''', ' + @max_length + ' [MaxLength],' +                     
                    'MAX(LEN('+QUOTENAME(@colname)+')) as [MaxUsedLength],
                    AVG(LEN('+QUOTENAME(@colname)+')) as [AvgUsedLength],
                    STDEV(LEN('+QUOTENAME(@colname)+')) as [StdDev]
                    FROM TableWithTooManyVarcharColumns'
            exec(@sql)
            fetch next from c 
                into @colname, @max_length
        end
    close c 
    deallocate c

Upvotes: 4

Related Questions