PriceCheaperton
PriceCheaperton

Reputation: 5349

How do I identify the column(s) responsible for “String or binary data would be truncated.”

I have an INSERT statement which looks like this:

INSERT INTO CLIENT_TABLE
SELECT NAME, SURNAME, AGE FROM CONTACT_TABLE

My example above is a basic one, but is there a way to pass in a SELECT statement and then check the returned column values against what the actual field sizes are?

Checking LEN against every column isnt practical. I am looking for something that is automated.

Upvotes: 1

Views: 2903

Answers (5)

Brian Pressler
Brian Pressler

Reputation: 6713

Here is some code to compare two row producing SQL statements to compare the columns. It takes as parameters two row-sets specified with server name, database name, and T-SQL query. It can compare data in different databases and even on different SQL Servers.

--setup parameters
declare @Server1 as varchar(128)
declare @Database1 as varchar(128)
declare @Query1 as varchar(max)
declare @Server2 as varchar(128)
declare @Database2 as varchar(128)
declare @Query2 as varchar(max)
set @Server1 = '(local)'
set @Database1 = 'MyDatabase'
set @Query1 = 'select * from MyTable' --use a select
set @Server2 = '(local)'
set @Database2 = 'MyDatabase2'
set @Query2 = 'exec MyTestProcedure....' --or use a procedure

--calculate statement column differences
declare @SQLStatement1 as varchar(max)
declare @SQLStatement2 as varchar(max)

set @Server1 = replace(@Server1,'''','''''')
set @Database1 = replace(@Database1,'''','''''')
set @Query1 = replace(@Query1,'''','''''')
set @Server2 = replace(@Server2,'''','''''')
set @Database2 = replace(@Database2,'''','''''')
set @Query2 = replace(@Query2,'''','''''')

CREATE TABLE #Qry1Columns(
    [colorder] [smallint] NULL,
    [ColumnName] [sysname] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [TypeName] [sysname] COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
    [prec] [smallint] NULL,
    [scale] [int] NULL,
    [isnullable] [int] NULL,
    [collation] [sysname] COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]

CREATE TABLE #Qry2Columns(
    [colorder] [smallint] NULL,
    [ColumnName] [sysname] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
    [TypeName] [sysname] COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
    [prec] [smallint] NULL,
    [scale] [int] NULL,
    [isnullable] [int] NULL,
    [collation] [sysname] COLLATE SQL_Latin1_General_CP1_CI_AS NULL
) ON [PRIMARY]

set @SQLStatement1 =
'SELECT *
INTO #Qry1
FROM OPENROWSET(''SQLNCLI'', 
    ''server=' + @Server1 + ';database=' + @Database1 + ';trusted_connection=yes'',
    ''select top 0 * from (' + @Query1 + ') qry'')

select colorder, syscolumns.name ColumnName, systypes.name TypeName, syscolumns.prec, syscolumns.scale, syscolumns.isnullable, syscolumns.collation
from tempdb.dbo.syscolumns
join tempdb.dbo.systypes
on syscolumns.xtype = systypes.xtype
where id = OBJECT_ID(''tempdb.dbo.#Qry1'')
order by 1'

insert into #Qry1Columns
exec(@SQLStatement1)


set @SQLStatement2 =
'SELECT *
INTO #Qry1
FROM OPENROWSET(''SQLNCLI'', 
    ''server=' + @Server2 + ';database=' + @Database2 + ';trusted_connection=yes'',
    ''select top 0 * from (' + @Query2 + ') qry'')

select colorder, syscolumns.name ColumnName, systypes.name TypeName, syscolumns.prec, syscolumns.scale, syscolumns.isnullable, syscolumns.collation
from tempdb.dbo.syscolumns
join tempdb.dbo.systypes
on syscolumns.xtype = systypes.xtype
where id = OBJECT_ID(''tempdb.dbo.#Qry1'')
order by 1'

insert into #Qry2Columns
exec(@SQLStatement2)

select ISNULL( #Qry1Columns.colorder, #Qry2Columns.colorder) ColumnNumber,
    #Qry1Columns.ColumnName ColumnName1,
    #Qry1Columns.TypeName TypeName1,
    #Qry1Columns.prec prec1,
    #Qry1Columns.scale scale1,
    #Qry1Columns.isnullable isnullable1,
    #Qry1Columns.collation collation1,
    #Qry2Columns.ColumnName ColumnName2,
    #Qry2Columns.TypeName TypeName2,
    #Qry2Columns.prec prec2,
    #Qry2Columns.scale scale2,
    #Qry1Columns.isnullable isnullable2,
    #Qry2Columns.collation collation2
from #Qry1Columns
join #Qry2Columns
on #Qry1Columns.colorder=#Qry2Columns.colorder

You can tweak the finally select statement to highlight any differences that you wish. You can also wrap this up in a procedure and make a nice little user interface for it if you like, so that it's literally a cut and paste away to quick results.

Upvotes: 0

japzdivino
japzdivino

Reputation: 1746

My debugging in that kind of problem is..

I am removing columns in the SELECT one by one, if did not return error, then you know what column is the cause of truncation problem.. but here are some tips on debugging.

  • Option 1: Start first with the columns that hold more character.. like VARCHAR, for example in your case, i think the column NAME, SURNAME are the one causes an error since AGE column does not hold many characters because its integer. You should debug something like that.

  • Option 2: You can investigate the column in your final output. The final SELECT will return all columns and its values, then you can counter check if the values matches what you input on the UI etc.

    Ex. See the Expected vs. Actual Output result on the image below

    Expected:

    enter image description here

    Actual Output:

    enter image description here

    My example in option 2 shows that the truncated string is the SURNAME as you can see..

    NOTE: You can only use the Option 2 if the query did not return execution error, meaning to say that the truncated string did not return an error BUT created an unexpected split string which we don't want.

    IF the query return an error, your best choice is Option 1, which consume more time but worth it, because that is the best way to make sure you found the exact column that causes the truncation problem

Then if you already found the columns that causes the problem, you can now adjust the size of the column or another way is to limit the input of the user ?, you can put some validation to users to avoid truncation problem, but it is all up to you on how you want the program works depending on your requirement.

My answers/suggestion is base on my experience in that kind of situation.

Hope this answer will help you. :)

Upvotes: 2

Ubercoder
Ubercoder

Reputation: 741

A manual solution is very quick if you are using SQL Server Manager Studio (SSMS). First capture the table structure of your SELECT statement into a working table:

SELECT COL1, COL2, ... COL99 INTO dbo.zz_CONTACT_TABLE 
FROM CONTACT_TABLE WHERE 1=0;

Then in SSMS, right-click your original destination table (CLIENT_TABLE) and script it as create to a new SSMS window. Then right-click your working table (zz_CONTACT_TABLE) and script the creation of this table to a second SSMS window. Arrange both windows side by side and check the columns of zz_CONTACT_TABLE against CLIENT_TABLE. Differences in length and out-of-order columns will be immediately seen, even if there are hundreds of output columns.

Finally drop your working table:

DROP TABLE dbo.zz_CONTACT_TABLE;

Regarding an automated solution, it is difficult to see how this could work. Basically you are comparing a destination table (or a subset of columns in a destination table) against the output of a SELECT statement. I suppose you could write a stored procedure that takes two varchar parameters: the name of the destination table and the SELECT statement that would populate it. But this would not handle the case where only some columns of the destination are populated, and it would be more work than the manual solution above.

Upvotes: 0

Vladimir Baranov
Vladimir Baranov

Reputation: 32685

There is a great answer by Aaron Bertrand to the question: Retrieve column definition for stored procedure result set

If you used SQL Server 2012+ you could use sys.dm_exec_describe_first_result_set. Here is a nice article with examples. But, even in SQL Server 2008 it is possible to retrieve the types of columns of the query. Aaron's answer explains it in details.

In fact, in your case it is easier, since you have a SELECT statement that you can copy-paste, not something that is hidden in a stored procedure. I assume that your SELECT is a complex query returning columns from many tables. If it was just one table you could use sys.columns with that table directly.

So, create an empty #tmp1 table based on your complex SELECT:

SELECT TOP(0)
    NAME, SURNAME, AGE
INTO #tmp1 
FROM CONTACT_TABLE;

Create a second #tmp2 table based on the destination of your complex SELECT:

SELECT TOP(0)
    NAME, SURNAME, AGE
INTO #tmp2
FROM CLIENT_TABLE;

Note, that we don't need any rows, only columns for metadata, so TOP(0) is handy. Once those #tmp tables exist, we can query their metadata using sys.columns and compare it:

WITH
CTE1
AS
(
    SELECT
        c.name AS ColumnName
        ,t.name AS TypeName
        ,c.max_length
        ,c.[precision]
        ,c.scale
    FROM
        tempdb.sys.columns AS c
        INNER JOIN tempdb.sys.types AS t ON
                c.system_type_id = t.system_type_id
            AND c.user_type_id = t.user_type_id
    WHERE
        c.[object_id] = OBJECT_ID('tempdb.dbo.#tmp1')
)
,CTE2
AS
(
    SELECT
        c.name AS ColumnName
        ,t.name AS TypeName
        ,c.max_length
        ,c.[precision]
        ,c.scale
    FROM
        tempdb.sys.columns AS c
        INNER JOIN tempdb.sys.types AS t ON
                c.system_type_id = t.system_type_id
            AND c.user_type_id = t.user_type_id
    WHERE
        c.[object_id] = OBJECT_ID('tempdb.dbo.#tmp2')
)
SELECT *
FROM
    CTE1
    FULL JOIN CTE2 ON CTE1.ColumnName = CTE2.ColumnName
WHERE
    CTE1.TypeName <> CTE2.TypeName
    OR CTE1.max_length <> CTE2.max_length
    OR CTE1.[precision] <> CTE2.[precision]
    OR CTE1.scale <> CTE2.scale
;

Another possible way to compare:

WITH

... as above ...

SELECT * FROM CTE1

EXCEPT 

SELECT * FROM CTE2
;

Finally

DROP TABLE #tmp1;
DROP TABLE #tmp2;

You can tweak the comparison to suit your needs.

Upvotes: 1

Irfan Siddiq
Irfan Siddiq

Reputation: 19

Check max length for each field, this way you can identify the fields that are over char limit specified in your table e.g CLIENT_TABLE.

    SELECT Max(Len(NAME)) MaxNamePossible
           , Max(Len(SURNAME)) MaxSurNamePossible
           , Max(Len(AGE)) MaxAgePossible 
    FROM CONTACT_TABLE

Compare the result with Client_Table Design Like if in Client_Table "Name" is of Type Varchar(50) and validation query( written above) return more than 50 chars than "Name" field is causing over flow.

Upvotes: 1

Related Questions