Reputation: 11

SQL Server table with 1.2 million rows and 400+ columns very slow even on simple count(*)

I use SQL Server 2008 R2 and have a table copied from a DB2 table. I'm aware that the amount of rows is not normal, but this is historical and I can't do anything about that at the moment.

But a simple count of rows takes more than 2 minutes. The table consists of 3 columns with indexed customer ids then rest is fields with decimal numbers.

A search like this:

select 
    AD_ARBNUM, AD_SHBETSA, AD_AMBIDSHBETSA,AD_ASKATSHBETSA
from 
    lmoko 
where 
    AD_SHBETSA + AD_AMBIDSHBETSA + AD_ASKATSHBETSA < 0

where AD_ARBNUM is indexed and the rest is decimal takes 3+ minutes.

If run in DB2 the same query is run under 20 sec. (I do not know about indexing on the DB2 part)

Any suggestions to speed up performance here?

Upvotes: 0

Answers (3)

Diego

Reputation: 36126

if you are just worried about the count, you can run:

SELECT SUM (row_count) 
FROM sys.dm_db_partition_stats 
WHERE object_id = OBJECT_ID('lmoko ') AND (index_id=0 or index_id=1);

also, there are things you can you with your table besides indexes, like creating partitions to store old data for example.

Upvotes: 0

Philip Kelley

Reputation: 40309

(My answer fleshes out the first few comments, just to provide details and some context.)

SQL indexes are used to "look up" specific values. With an index on AD_ARBUM, SQL will find all rows with a specifc value (say, 12) in just about no time (unless, of course, half the table's rows are set to 12, in which case you'll have to read half the table). Your query filter is based on a formula based on several columns, none of which are indexed, so all those columns--across all 1.2 millions rows--will need to be read, to evaluate which to include and which not. If you built an index across all three of your formula columns (AD_SHBETSA, AD_AMBIDSHBETSA, AD_ASKATSHBETSA), it would still have to do the same mathematical formula on each. If you built an index on the formula itself

CREATE nonclustered INDEX IX_lmoko__ThreeColumnFormula
 on lmokok (AD_SHBETSA + AD_AMBIDSHBETSA + AD_ASKATSHBETSA

then SQL could search and filter based on the value you care about, and should be much faster (presuming, of course, that you don't end up returning all the rows after all.) The downside of having such an index is that you have to maintain the index; it will take up space, it can slow down inserts and updates, and if it's only used for a query you run once a month (as opposed to once a minute) that cost to performance and resources might be too high.

Also, as pointed out, there might be other factors involved, from poor hardware to resource contention to overall space (how wide is a row? How many columns are actually involved? Are those numeric values, or does SQL have to convert from strings? Lots of potential gotchas we can't know about without looking the system over)

Upvotes: 1

Grant Fritchey

Reputation: 2775

That WHERE clause is adding three different columns together and then comparing them to a value. In order to do that SQL SErver has to scan the entire table. Any kind of function against a column like this will lead to scans against the table. As the table grows, the queries will just get slower & slower.

You can create a computed column. Assuming you're running 2005 or better you can also put an index on that column.

Upvotes: 2

SQL Server table with 1.2 million rows and 400+ columns very slow even on simple count(*)

Answers (3)

Related Questions