Cannot create CLUSTERED INDEX on a View due to LEFT JOIN or subquery

Question

I have created two options for a query used in a view which return the results I require. I need to rewrite either option so that it can be used in an Indexed View. Both fail when creating a unique clustered index on the view. The first one fails due to the LEFT OUTER JOIN, and the second fails due to a subquery. I believe both will also fail due to a self join.

After finding Creating Indexed Views, there is a large list of TSQL syntax elements that cannot be used. Among them: a derived table, UNION, EXCEPT, INTERSECT, subqueries, Outer or self joins, TOP, ORDER BY, DISTINCT, MAX...

The query should get the largest CompanyID for each unique Company. The StatusName in the Statuses table also needs to be displayed, and I'm only adding that in case it affects the solution. It's currently an INNER JOIN, so it's not causing a problem with creating the index.

Example for the Companies table, with all 3 columns being INT:

CompanyID Company Revision
1         1       1
2         1       2
3         2       1
4         2       2

The query should return:

CompanyID Company Revision
2         1       2
4         2       2

Here are the two options I've created:

SELECT t1.CompanyID, t1.Company, t1.Revision, Statuses.StatusName
FROM dbo.Companies AS t1

LEFT OUTER JOIN dbo.Companies AS t2
ON t1.Company = t2.Company AND t1.CompanyID < t2.CompanyID

INNER JOIN dbo.Statuses
ON dbo.Statuses.StatusID = t1.StatusID

WHERE t2.Company IS NULL

And the other:

SELECT t1.CompanyID, t1.Company, t1.Revision, Statuses.StatusName
FROM dbo.Companies AS t1

INNER JOIN dbo.Statuses
ON dbo.Statuses.StatusID = t1.StatusID

WHERE t1.Company NOT IN (SELECT t2.Company from dbo.Companies AS t2 WHERE t1.CompanyID < t2.CompanyID)

So, my question is, can either query be rewritten to be used in an Indexed View?

I'm using MS SQL Server 2008 R2, and 2005.

anon · Accepted Answer

Instead of creating an exclusionary view, why not try the other way:

CREATE VIEW dbo.HighestCompany
AS
  SELECT t1.CompanyID, t1.Company, t1.Revision, s.StatusName
    FROM dbo.Companies AS t1
    INNER JOIN (
      SELECT Company, HighestCompany = MAX(CompanyID) 
      FROM dbo.Companies GROUP BY Company
    ) AS t2
    ON t1.Company = t2.Company
    AND t1.CompanyID = t2.HighestCompany -- not sure if CompanyID is unique
    INNER JOIN dbo.Statuses AS s
    ON s.StatusID = t1.StatusID;

You still won't be able to create an indexed view on this, but it may work a little better than the versions you currently have (depending on several factors, of course, including index on Company and selectivity).

Aside from that, I think to improve performance you'll need to take a look at the indexing strategy on the base tables. Why does your Companies table allow multiple companies with the same name and a different ID? Maybe that is part of the problem, and you should just store the currently relevant company in a separate table.

You could do this as follows (keep in mind I am guessing at data types and optimal indexes here):

CREATE SCHEMA hold AUTHORIZATION dbo;
GO
CREATE SCHEMA cache AUTHORIZATION dbo;
GO
CREATE TABLE dbo.HighestCompany
(
  CompanyID INT, 
  Company NVARCHAR(255) PRIMARY KEY,
  Revision INT,
  StatusName NVARCHAR(64)
);
GO
CREATE TABLE cache.HighestCompany
(
  CompanyID INT, 
  Company NVARCHAR(255) PRIMARY KEY,
  Revision INT,
  StatusName NVARCHAR(64)
);
GO

Now however often you think this data needs to be refreshed, you can run a job that does the following:

TRUNCATE TABLE cache.HighestCompany;

INSERT cache.HighestCompany(CompanyID, Company, Revision, StatusName)
SELECT t1.CompanyID, t1.Company, t1.Revision, s.StatusName
        FROM dbo.Companies AS t1
        INNER JOIN (
          SELECT Company, HighestCompany = MAX(CompanyID) 
          FROM dbo.Companies GROUP BY Company
        ) AS t2
        ON t1.Company = t2.Company
        AND t1.CompanyID = t2.HighestCompany
        INNER JOIN dbo.Statuses AS s
        ON s.StatusID = t1.StatusID;

-- this is a fast, metadata operation that should result
-- in minimal blocking and disruption to end users:
BEGIN TRANSACTION;
  ALTER SCHEMA hold TRANSFER dbo.HighestCompany;
  ALTER SCHEMA dbo TRANSFER cache.HighestCompany;
  ALTER SCHEME cache TRANSFER hold.HighestCompany;
COMMIT TRANSACTION;

If you find the companies change so often, or the data really needs to be up-to-the-second, that this isn't practical, you could do something similar with a trigger as @Dems suggested.

Cannot create CLUSTERED INDEX on a View due to LEFT JOIN or subquery

Answers (2)

Related Questions