Reputation: 942
CREATE TABLE [sql_table1] ([c0] varbinary(25) NOT NULL primary key)
go
insert into sql_table1 values (0x3200),(0x32);
go
I get:
Cannot insert duplicate key in object 'dbo.sql_table1'. The duplicate key value is (0x32).
Why? 0x32 does not equal 0x3200.
Upvotes: 4
Views: 1178
Reputation: 21641
It gets right-padded. BINARY data gets tricky when you try to specify hex values that should normally be numerically equivalent. If you try this, it will work:
insert into #sql_table1 values (0x32),(CAST(50 as VARBINARY(25)));
-- inserts 0x32
-- and 0x00000032
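The right-padding mentioned above can be seen directly with a plain comparison (a minimal sketch; DATALENGTH just shows how many bytes each literal is actually stored as):
SELECT CASE WHEN 0x32 = 0x3200 THEN 'equal' ELSE 'not equal' END; -- 'equal': the shorter value is right-padded with 0x00 before comparing, which is why the PK saw a duplicate
SELECT DATALENGTH(0x32), DATALENGTH(0x3200); -- 1 and 2: they are stored with different byte counts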
But those two values - 0x32 and 0x00000032 - are numerically equivalent. Generally speaking, it's a bad idea to have a BINARY column of any sort be a primary key, or to put a unique index on it (more so than a CHAR/VARCHAR/NVARCHAR column) - any application that inserts into it is almost certainly going to be CASTing from some native format/representation to binary, but there's no guarantee that that CAST actually works in a unique manner in either direction - did the application insert the value 50 (= 0x32), did it try to insert the literal 0x32, did it try to insert the ASCII value of '2' (= 0x32), or did it insert the first byte(s) of something else...? If one app submits 0x32 and another 0x0032, are they the same or different (SQL Server says different - see below)?
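To make that ambiguity concrete, here's a rough sketch (the output bytes in the comments are what I'd expect from the usual CAST behavior):
SELECT CAST(50 AS VARBINARY(25));  -- 0x00000032 (the 4 bytes of the int 50)
SELECT CAST('2' AS VARBINARY(25)); -- 0x32       (the ASCII byte of the character '2')
SELECT 0x32;                       -- 0x32       (a literal single byte)
-- three different "meanings" that all collide around the same-looking hex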
The bottom line is that SQL Server is going to behave unusually if you try to compare binary columns flat out without some context. What will work is comparing binary data using BINARY_CHECKSUM:
SELECT BINARY_CHECKSUM(0x32) -- 50
SELECT BINARY_CHECKSUM(0x320) -- 16! -- it's treating this as having a different number or ordering of bytes
SELECT BINARY_CHECKSUM(0x3200) -- 50
SELECT BINARY_CHECKSUM(0x32000) -- 16
SELECT BINARY_CHECKSUM(0x0032) -- 50
SELECT BINARY_CHECKSUM(0x00032) -- 50
SELECT BINARY_CHECKSUM(0x000000000032) -- 50
but again, this only helps you see that the hexadecimal representation of the binary data isn't going to work exactly the way it would seem. The point is, your primary key is going to be based on the BINARY_CHECKSUMs of the data instead of any particular format/representation of the data. Normally that's a good thing, but with binary data (and padding) it becomes a lot trickier. Even then, in my example above the BINARY_CHECKSUM of both rows will be exactly the same (SELECT BINARY_CHECKSUM(c0) FROM #sql_table1 will output 50 for both rows). Weird - a little further testing shows that any different number of leading 0s that fits into the column length will bypass the unique check even though the checksum is the same (e.g. VALUES (0x32), (0x032), (0x0032), etc.).
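For example, against a fresh copy of the question's table (sql_table2 is just a throwaway name here, using whole-byte literals), I'd expect something like this to insert three rows whose checksums all come back as 50:
CREATE TABLE [sql_table2] ([c0] varbinary(25) NOT NULL primary key)
go
insert into sql_table2 values (0x32),(0x0032),(0x000032); -- all three pass the unique check (right-padded comparison sees different lengths of trailing bytes)
SELECT c0, BINARY_CHECKSUM(c0) FROM sql_table2;           -- checksum should be 50 for every row
go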
This only gets worse if you start throwing different versions of SQL Server into the mix (per MSDN documentation).
What you should do for PK/unique design on a table is figure out what context will make sense of this data - an order number, a file reference, a timestamp, a device ID, some other business or logical identifier, etc. If nothing else, pseudokey it with an IDENTITY column.
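A minimal sketch of that pseudokey approach (names are placeholders) - the binary value becomes ordinary data and the IDENTITY column carries the uniqueness:
CREATE TABLE [sql_table1_v2] (
    [id] int IDENTITY(1,1) NOT NULL primary key, -- surrogate/pseudokey
    [c0] varbinary(25) NOT NULL                  -- the binary payload, no longer the key
)
go
insert into sql_table1_v2 (c0) values (0x3200),(0x32); -- both rows from the question now insert without a duplicate-key error
go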
Upvotes: 2