ParoX
ParoX

Reputation: 5941

How to create indexed view of children count

I am trying to take a table with a parent child relationship and get the number of children. I would like to create an indexed view of the number of children by utilizing COUNT_BIG(*).

The problem is that in my index view I don't want to eliminate the entities who have no children, instead I want the Count to be 0 for those.

Given

> Id | Entity | Parent
> -: | :----- | :-----
>  1 | A      | null  
>  2 | AA     | A     
>  3 | AB     | A     
>  4 | ABA    | AB    
>  5 | ABB    | AB    
>  6 | AAA    | AA    
>  7 | AAB    | AA    
>  8 | AAC    | AA    

I want to create an indexed view that returns

> Entity | Count
> :----- | ----:
> A      |     2
> AA     |     3
> AB     |     2
> ABA    |     0
> ABB    |     0
> AAA    |     0
> AAB    |     0
> AAC    |     0

Here is my SQL that works, but using a LEFT JOIN and a CTE (both are not allowed in an index view)

    DROP TABLE IF EXISTS Example
    CREATE TABLE Example (
      Id INT primary key,
      Entity varchar(50),
      Parent varchar(50)
    )
    
    
    INSERT INTO Example
    VALUES 
       (1, 'A', NULL)
      ,(2, 'AA',  'A')
      ,(3, 'AB','A')
      ,(4, 'ABA', 'AB')
      ,(5, 'ABB', 'AB')
      ,(6, 'AAA', 'AA')
      ,(7, 'AAB', 'AA')
      ,(8, 'AAC', 'AA')
    
    
    
    SELECT *
    FROM Example
    
    ;WITH CTE AS (
     SELECT Parent, COUNT(*) as Count
      FROM dbo.Example
      GROUP BY Parent
    )
      
    SELECT e.Entity, COALESCE(Count,0) Count
    FROM dbo.Example e
    LEFT JOIN CTE g
    ON e.Entity = g.Parent


GO

Upvotes: 4

Views: 780

Answers (3)

ParoX
ParoX

Reputation: 5941

I was able to accomplish what I was after by doing a cartesian join on the rows which would be 0 count (N=2).

Create table called two rows which will duplicate the grandchildren

DROP TABLE IF EXISTS TwoRows
CREATE TABLE TwoRows (
    N INT primary key
)

INSERT INTO TwoRows
VALUES (1),(2)

Get the original table

DROP TABLE IF EXISTS Example
CREATE TABLE Example (
    Id INT primary key,
    Entity varchar(50),
    Parent varchar(50)
)


INSERT INTO Example
VALUES 
     (1, 'A', NULL)
    ,(2, 'AA',  'A')
    ,(3, 'AB','A')
    ,(4, 'ABA', 'AB')
    ,(5, 'ABB', 'AB')
    ,(6, 'AAA', 'AA')
    ,(7, 'AAB', 'AA')
    ,(8, 'AAC', 'AA')

Create the indexed view

DROP VIEW IF EXISTS dbo.indexedView 
CREATE VIEW dbo.indexedView  WITH SCHEMABINDING AS
    SELECT 
        IIF(T.N = 2, Entity, Parent) as Entity
        , COUNT_BIG(*) as CountPlusOne
        , COUNT_BIG(ALL IIF(T.N = 2, NULL, 1)) as Count
    FROM dbo.Example E1
    INNER JOIN dbo.TwoRows T
        ON 1=1
    WHERE IIF(T.N = 2, Entity, Parent) IS NOT NULL
    GROUP BY IIF(T.N = 2, Entity, Parent)
GO

CREATE UNIQUE CLUSTERED INDEX testIndex ON indexedView(Entity)

SELECT *
FROM indexedView

I wasn't able how to avoid using COUNT_BIG(*)

Upvotes: 2

Hadi
Hadi

Reputation: 37368

I don't think you can achieve that using a CTE neither a LEFT JOIN because there are many restriction using the indexed views.

Workaround

I suggest splitting the query into two part:

  1. Create an indexed view instead of a common table expression (CTE)
  2. Create a non indexed view that perform the LEFT JOIN

Beside that, create a Non-Clustered index on Entity column in Table Example .

Then when you query the non-indexed view, it will use indexes

--CREATE TABLE
CREATE TABLE Example (
  Id INT primary key,
  Entity varchar(50),
  Parent varchar(50)
)

--INSERT VALUES
INSERT INTO Example
VALUES 
   (1, 'A', NULL)
  ,(2, 'AA',  'A')
  ,(3, 'AB','A')
  ,(4, 'ABA', 'AB')
  ,(5, 'ABB', 'AB')
  ,(6, 'AAA', 'AA')
  ,(7, 'AAB', 'AA')
  ,(8, 'AAC', 'AA')

--CREATE NON CLUSTERED INDEX
CREATE NONCLUSTERED INDEX idx1 ON dbo.Example(Entity);

--CREATE Indexed View

CREATE VIEW dbo.ExampleView_1
    WITH SCHEMABINDING
    AS 
 SELECT Parent, COUNT_BIG(*) as Count
  FROM dbo.Example
  GROUP BY Parent

CREATE UNIQUE CLUSTERED INDEX idx ON dbo.ExampleView_1(Parent);

--Create non-indexed view
CREATE VIEW dbo.ExampleView_2
    WITH SCHEMABINDING
    AS 
    SELECT e.Entity, COALESCE(Count,0) Count
    FROM dbo.Example e
    LEFT JOIN dbo.ExampleView_1 g
    ON e.Entity = g.Parent

So when you perform the following query:

SELECT * FROM dbo.ExampleView_2 WHERE Entity = 'A'

You can see that the view Clustered index and the Table Non-Clustered index are used in the execution plan:

enter image description here

Additional Information

I didn't find additional workarounds to replace the use of LEFT JOIN or UNION or CTE in indexed views, you can check many similar Stackoverflow questions:


Update 1 - Splitting view vs. Cartesian join

To identify the better approach, i tried to compare both suggested approaches.

--The other approach (cartesian join)
CREATE TABLE TwoRows (
    N INT primary key
)

INSERT INTO TwoRows
VALUES (1),(2)

CREATE VIEW dbo.indexedView  WITH SCHEMABINDING AS
    SELECT 
        IIF(T.N = 2, Entity, Parent) as Entity
        , COUNT_BIG(*) as CountPlusOne
        , COUNT_BIG(ALL IIF(T.N = 2, NULL, 1)) as Count
    FROM dbo.Example E1
    INNER JOIN dbo.TwoRows T
        ON 1=1
    WHERE IIF(T.N = 2, Entity, Parent) IS NOT NULL
    GROUP BY IIF(T.N = 2, Entity, Parent)
GO

CREATE UNIQUE CLUSTERED INDEX testIndex ON indexedView(Entity)

I created each indexed view on seperate databases and performed the following query:

SELECT * FROM View WHERE Entity = 'AA'

Splitting view

enter image description here

Cartesian Join

enter image description here

Time Statistics

The time statistics shows that the Cartesian join approach execution time is higher than the Splitting view approach, as shown in the image below (cartesian join to the right):

enter image description here

Adding WITH(NOEXPAND)

Also i tried to add WITH(NOEXPAND) option the the cartesian join approach, to force the database engine to use the indexed view clustered index and the result was as following:

enter image description here

I cleared all caches and perform a comparison, the Time statistics comparison shows that the Splitting view approach is still faster than the cartesian join approach (WITH(NOEXPAND) approach to the right):

enter image description here

Upvotes: 3

gotqn
gotqn

Reputation: 43666

You can create AFTER INSERT,UPDATE, DELETE trigger on your example table and a new table to materialized the results.

In the trigger, you are allowed to use any statement. You can do this in two ways, depending on how fast is your query initial query.

For example, you can truncate the table on every INSERT/UPDATE/DELETE and then to calculate the count and insert it again (if the query is fast).

Or you can rely on inserted and deleted tables which are special tables visible in the trigger context and showing how rows values have changed.

For example, if a record exists in the inserted table and not in the deleted - this is(are) a new row(s). You can calculate the COUNT only for them.

If a record exists only in the deleted table - this is a deletion (we need to remove the row for our precalculated table).

And a row exists in both tables - these is an update - we need to perform new count for the record.

One thing is very important here - do not manipulate the rows one by one. Always work in batches of rows for the three cases above or you will end up with poor performing trigger which will delay the CRUD operations with the original table.

Upvotes: 0

Related Questions