vbNewbie
vbNewbie

Reputation: 3345

hash functions-sql studio express

I need to create a hash key on my tables for uniqueness and someone mentioned to me about md5. But I have read about checksum and binary sum; would this not serve the same purpose? To ensure no duplicates in a specific field. Now I managed to implement this and I see the hask keys in my tables. Do I need to alter index keys originally created since I created a new index key with these hash keys? Also do I need to change the keys? How do I change my queries for example SELECT statements? I guess I am still unsure how hash keys really help in queries other than uniqueness?

Upvotes: 2

Views: 748

Answers (5)

SQLMenace
SQLMenace

Reputation: 135111

why don't you use a GUID with a default of NEWSEQUENTIALID() ..don't use NEWID() since it is horrible for clustering, see here: Best Practice: Do not cluster on UniqueIdentifier when you use NewId

make this column the primary key and you are pretty much done

Upvotes: 0

Luka Rahne
Luka Rahne

Reputation: 10457

Cryptographically save Hash functions are one way functions and they consume more resources (CPU cycles) that functions that are not cryptographically secure. If you just need function as hash key you do not need such property. All you need is low probability for collisions what is related whit uniformity. Try whit CRC or if you have strings or modulo for numbers.

http://en.wikipedia.org/wiki/Hash_function

Upvotes: 0

gbn
gbn

Reputation: 432421

It makes no sense to write a unique function to replace SQL Server unique constraints/indexes.

How are you going to ensure the hash is unique? With a constraint?

If you index it (which may not be allowed because of determinism), then the optimiser will treat it as non-unique. As well as killing performance.

And you only have a few 100,000 rows. Peanuts.

Given time I could come up with more arguments, but I'll summarise: Don't do it

Upvotes: 2

Joel Coehoorn
Joel Coehoorn

Reputation: 416039

There's always the HashBytes() function. It supports md5, but if you don't like it there's an option for sha1.

As for how this can help queries: one simple example is if you have a large varchar column — maybe varchar max — and in your query you want to know if the contents of this column match a particular string. If you have to compare your search with every single record it could be slow. But if you hash your search string and use that, things can go much faster since now it's just a very short binary compare.

Upvotes: 1

Quick Joe Smith
Quick Joe Smith

Reputation: 8232

If your goal is to ensure no duplicates in a specific field, why not just apply a unique index to that field and let the database engine do what it was meant to do?

Upvotes: 2

Related Questions