Cedric Aube
Cedric Aube

Reputation: 345

What are the best practices for creating indexes on multiple bit columns?

Good day,

In SQL Server 2005, I have a table numerous columns, including a few boolean (bit) columns. For example, table 'Person' has columns ID and columns HasItem1, HasItem2, HasItem3, HasItem4. This table is kinda large, so I would like to create indexes to get faster search results.

I know that is not I good idea to create an index on a bit column, so I thought about using a index with all of the bit columms. However, the thing is, all of these bit columns may or may not be in the query. Since the order of the indexed columns are important in an index, and that I don't know which ones will be used in the query, how should I handle this?

BTW, there is already clustered index that I can't remove.

Upvotes: 1

Views: 1106

Answers (6)

mrdenny
mrdenny

Reputation: 5078

Odds are it will be easier for SQL to query the large table with the person_id and item_id and BitValue then it will be to search a single table with Item1, Item2, ... ItemN.

Upvotes: 0

HardCode
HardCode

Reputation: 6766

You should revisit the design of your database. Instead of having a table with fields HasItem1 to HasItem#, you should create a bridge entity, and a master Items table if you don't have one. The bridge entity (table), person_items, would have (a minimum of) two fields: person_id and item_id.

Designing the database this way doesn't lock you in to a database that only handles N number of items based on column definitions. You can add as many items as you want to a master Items table, and associate as many of them as you need with as many people as you need.

Upvotes: 0

Sam
Sam

Reputation: 7678

How about using checksum?

Add a int field named mysum to your table and execute this

UPDATE checksumtest SET mysum = CHECKSUM(hasitem1,hasitem2,hasitem3,hasitem4) 

Now you have a value that represents the combination of bits.

Do the same checksum calc in your search query and match on mysum.

This may speed things up.

Upvotes: 0

HLGEM
HLGEM

Reputation: 96600

I don't know about 2005 but in SQL Server 2000 (From Books Online): "Columns of type bit cannot have indexes on them."

Upvotes: 0

dnewcome
dnewcome

Reputation: 2075

I don't know a lot of specifics on sql server, but in general indexing a column that has non-unique data is not very effective. In some RDBMS systems, the optimizer will ignore indexes that are less than a certain percent unique anyway, so the index may as well not even exist.

Using a composite, or multi-column index can help, but only in particular cases where the filter constraints are in the same order that the index was built in. If you index includes 'field1, field2' and you are searching for 'field2, field1' or some other combination, the index may not be used. You could add an index for each of the particular search cases that you want to optimize, that is really all I can think of that you could do. And in the case that your data is not very unique, even after considering all of the bit fields, the index may be ignored anyway.

For example, if you have 3 bit fields, you are only segmenting your data into 8 distinct groups. If you have a reasonable number of rows in the table, segmenting it by 8 isn't going to be very effective.

Upvotes: 2

Steve Weet
Steve Weet

Reputation: 28402

I would suggest that this is probably not a good idea. Trying to index fields with very low cardinality will generally not make queries faster and you have the overhead of maintaining the index as well.

If you generally search for one of your bit fields with another field then a composite index on the two fields would probably benefit you.

If you were to create a composite index on the bit fields then this would help but only if the composite fields at the beginning of the index were provided. If you do not include the 1st value within the composite index then the index will probably not be used at all.

If, as an example bita was used in 90% of your queries and bitd in 70% and bits b and c in 20% then a composite index on (bita, bitd, bitb, bitc) would probably yield some benefit but for at least 10% of your queries and possibly even 40% the index would most likely not be used.

The best advice is probably to try it with the same data volumes and data cardinality and see what the Execution plan says.

Upvotes: 2

Related Questions