Reputation: 105
I have created two options for a query used in a view which return the results I require. I need to rewrite either option so that it can be used in an Indexed View. Both fail when creating a unique clustered index on the view. The first one fails due to the LEFT OUTER JOIN, and the second fails due to a subquery. I believe both will also fail due to a self join.
After finding Creating Indexed Views, there is a large list of TSQL syntax elements that cannot be used. Among them: a derived table, UNION, EXCEPT, INTERSECT, subqueries, Outer or self joins, TOP, ORDER BY, DISTINCT, MAX...
The query should get the largest CompanyID
for each unique Company
. The StatusName
in the Statuses
table also needs to be displayed, and I'm only adding that in case it affects
the solution. It's currently an INNER JOIN
, so it's not causing a problem with creating the index.
Example for the Companies
table, with all 3 columns being INT
:
CompanyID Company Revision
1 1 1
2 1 2
3 2 1
4 2 2
The query should return:
CompanyID Company Revision
2 1 2
4 2 2
Here are the two options I've created:
SELECT t1.CompanyID, t1.Company, t1.Revision, Statuses.StatusName
FROM dbo.Companies AS t1
LEFT OUTER JOIN dbo.Companies AS t2
ON t1.Company = t2.Company AND t1.CompanyID < t2.CompanyID
INNER JOIN dbo.Statuses
ON dbo.Statuses.StatusID = t1.StatusID
WHERE t2.Company IS NULL
And the other:
SELECT t1.CompanyID, t1.Company, t1.Revision, Statuses.StatusName
FROM dbo.Companies AS t1
INNER JOIN dbo.Statuses
ON dbo.Statuses.StatusID = t1.StatusID
WHERE t1.Company NOT IN (SELECT t2.Company from dbo.Companies AS t2 WHERE t1.CompanyID < t2.CompanyID)
So, my question is, can either query be rewritten to be used in an Indexed View?
I'm using MS SQL Server 2008 R2, and 2005.
Upvotes: 3
Views: 11131
Reputation: 86706
Unfortunately, you can't.
Not only does your query necessitate a LEFT JOIN, you're LEFT JOINing the same table on to itself. And to quote BooksOnline and your question...
The SELECT statement in the view cannot contain the following Transact-SQL syntax elements:
- Outer or self joins.
An alternative option may be to create a real mapping table that you keep updated via a trigger. Records being created/deleted as Companies
changes, and records being updated as Statuses
changes.
Equally, views are expanded in-line into the queries in which they are used (unless you specifically say otherwise with the NOEXPAND
hint). Have you checked the execution plans of your queries to see if you could create more appropriate indexes on the base tables?
EDIT
An alternative query layout, just as an option...
;WITH
sequenced_companies
AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY Company ORDER BY CompanyID DESC) AS sequence_id,
*
FROM
dbo.companies
)
SELECT
*
FROM
sequenced_companies
INNER JOIN
dbo.statuses
ON statuses.StatusID = sequenced_companies.StatusID
WHERE
sequenced_companies.sequence_id = 1
With an index on (Company, CompanyID DESC)
, this should be quite quick. (Although still not suitable for an indexable view.)
Upvotes: 1
Reputation:
Instead of creating an exclusionary view, why not try the other way:
CREATE VIEW dbo.HighestCompany
AS
SELECT t1.CompanyID, t1.Company, t1.Revision, s.StatusName
FROM dbo.Companies AS t1
INNER JOIN (
SELECT Company, HighestCompany = MAX(CompanyID)
FROM dbo.Companies GROUP BY Company
) AS t2
ON t1.Company = t2.Company
AND t1.CompanyID = t2.HighestCompany -- not sure if CompanyID is unique
INNER JOIN dbo.Statuses AS s
ON s.StatusID = t1.StatusID;
You still won't be able to create an indexed view on this, but it may work a little better than the versions you currently have (depending on several factors, of course, including index on Company and selectivity).
Aside from that, I think to improve performance you'll need to take a look at the indexing strategy on the base tables. Why does your Companies table allow multiple companies with the same name and a different ID? Maybe that is part of the problem, and you should just store the currently relevant company in a separate table.
You could do this as follows (keep in mind I am guessing at data types and optimal indexes here):
CREATE SCHEMA hold AUTHORIZATION dbo;
GO
CREATE SCHEMA cache AUTHORIZATION dbo;
GO
CREATE TABLE dbo.HighestCompany
(
CompanyID INT,
Company NVARCHAR(255) PRIMARY KEY,
Revision INT,
StatusName NVARCHAR(64)
);
GO
CREATE TABLE cache.HighestCompany
(
CompanyID INT,
Company NVARCHAR(255) PRIMARY KEY,
Revision INT,
StatusName NVARCHAR(64)
);
GO
Now however often you think this data needs to be refreshed, you can run a job that does the following:
TRUNCATE TABLE cache.HighestCompany;
INSERT cache.HighestCompany(CompanyID, Company, Revision, StatusName)
SELECT t1.CompanyID, t1.Company, t1.Revision, s.StatusName
FROM dbo.Companies AS t1
INNER JOIN (
SELECT Company, HighestCompany = MAX(CompanyID)
FROM dbo.Companies GROUP BY Company
) AS t2
ON t1.Company = t2.Company
AND t1.CompanyID = t2.HighestCompany
INNER JOIN dbo.Statuses AS s
ON s.StatusID = t1.StatusID;
-- this is a fast, metadata operation that should result
-- in minimal blocking and disruption to end users:
BEGIN TRANSACTION;
ALTER SCHEMA hold TRANSFER dbo.HighestCompany;
ALTER SCHEMA dbo TRANSFER cache.HighestCompany;
ALTER SCHEME cache TRANSFER hold.HighestCompany;
COMMIT TRANSACTION;
If you find the companies change so often, or the data really needs to be up-to-the-second, that this isn't practical, you could do something similar with a trigger as @Dems suggested.
Upvotes: 5