Antediluvian
Antediluvian

Reputation: 733

When I retrieve a non-indexed column from a large table, SQL Server looks up the PK

I have a huge table which has millions of rows:

create table BigTable
(
    id bigint IDENTITY(1,1) NOT NULL,
    value float not null,
    vid char(1) NOT NULL,
    dnf char(4) NOT NULL,
    rbm char(6) NOT NULL,
    cnt char(2) not null,
    prs char(1) not null,
    ...
    PRIMARY KEY CLUSTERED (id ASC)
                WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
                      IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
                      ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
)

There is a non-clustered unique index on this table which is on the following columns:

vid, dnf, rbm, cnt, prs

Now I would like to get some data out of this huge table with a partial index:

insert into #t
    select b.id, b.rbm, b.value
    from 
        (select distinct x from #t1) a
    join 
        BigTable b on b.vid = '1'
                   and b.dnf = '1234'
                   and b.rbm = a.x

where:

create table #t
(
    id  integer primary key,
    rbm varchar(8),
    val float null
)

create index tmp_idx_t on #t(rbm)

create table #t1 (rbm char(6) not null)

If I don't include the value column in the query, the execution plan will not show the PK lookup on BigTable. But if I have the value in the resultset, a PK lookup will be on BigTable. The output list of this lookup is the very column val.

But I am not using any PKs here so it takes a lot of time to finish. Is there a way to stop that PK lookup? Or am I writing wrong SQL?

I know the BigTable is not a great design but I can't change it.

Upvotes: 1

Views: 738

Answers (2)

The Impaler
The Impaler

Reputation: 48865

In order to avoid the PK index lookup you'll need to include all the columns (filtering predicate AND result set) in the index. For example:

create index ix1 on BigTable (dnf, vid, rbm) include (value);

This index only uses the first three columns (dnf, vid, rbm) for search, and adds (value) columns as data. As pointed out by @DanGusman the id column is always present in a SQL Server secondary index. This way the index has all the data it needs to resolve the query.

Alternatively, you could use the old way:

create index ix1 on BigTable (dnf, vid, rbm, value);

This will also work, but will generate a heavier index. That being said, this heavier index may also be of use for other queries.

Upvotes: 3

Gordon Linoff
Gordon Linoff

Reputation: 1270873

Your index is lacking the id column. In order to resolve the query, the engine needs to go to the data pages. I would advise you include the column in the index, so the index covers the query.

I am thinking that you might try writing the query as:

select b.id, b.rbm, b.value
from BigTable b
where b.vid = '1' and
      b.dnf = '1234' and
      exists (select 1 from #t1 t1 where t1.x = b.rbm);

This might encourage SQL Server to use a more optimal execution plan, even when id is not in the index.

Upvotes: 0

Related Questions