Helping Optimizing a SQL Query

I have been reading a lot of Query Optimizations and I have been capable of optimizing most of them.

Nevertheless, I have a very complex query. It creates accumulated values for my accounting accounts. The query is taking more than 10 minutes to run, and I think that it should be a better way to optimize it, but I am not figuring out it.

The code that I want to optimize is this:

SELECT Empresa, IDCuenta, Año, Periodo, Saldo, 

((SELECT SUM(Saldo) 
FROM
(SELECT Empresa, IDCuenta, ReferenciaOrden,  SUM(Saldo) As Saldo
    FROM                                            
        (SELECT     Empresa, IDCuenta, ReferenciaOrden, SUM(Saldo) As Saldo 
        FROM dbo.GP_ContabilidadTrxActivas
        WHERE FechaTransacción<=GETDATE()
        GROUP BY    Empresa, IDCuenta, ReferenciaOrden
        UNION ALL
         SELECT     Empresa, IDCuenta, ReferenciaOrden, SUM(Saldo) As Saldo 
          FROM dbo.GP_ContabilidadTrxHistoricas
         WHERE FechaTransacción<=GETDATE()
         GROUP BY    Empresa, IDCuenta, ReferenciaOrden
         ) As Base
        GROUP BY Empresa, IDCuenta, ReferenciaOrden) As BaseInt
WHERE BaseInt.IDCuenta=BaseTotal.IDCuenta AND BaseInt.Empresa = BaseTotal.Empresa 
AND BaseInt.ReferenciaOrden<=BaseTotal.ReferenciaOrden
)) As SaldoAcumulado
FROM
(SELECT Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, SUM(Saldo) As Saldo
                        FROM                                            
                          (SELECT     Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, SUM(Saldo) As Saldo 
                          FROM dbo.GP_ContabilidadTrxActivas WITH (INDEX(IX_ReferenciaOrden)
                          WHERE FechaTransacción<=GETDATE()
                          GROUP BY    Empresa, IDCuenta, Año, Periodo,ReferenciaOrden
                           UNION ALL
                           SELECT     Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, SUM(Saldo) As Saldo 
                           FROM dbo.GP_ContabilidadTrxHistoricas WITH (INDEX(IX_ReferenciaOrden)
                           WHERE FechaTransacción<=GETDATE()
                           GROUP BY    Empresa, IDCuenta, Año, Periodo,ReferenciaOrden
                           ) As Base
GROUP BY Empresa, IDCuenta, Año, Periodo, ReferenciaOrden) As BaseTotal

The indexes that I created for this query are:

CREATE NONCLUSTERED INDEX IX_ReferenciaOrden
ON dbo.GP_ContabilidadTrxHistoricas (IDCuenta ASC, ReferenciaOrden ASC)
INCLUDE (Empresa,  Año, Periodo, Saldo, FechaTransacción);


CREATE NONCLUSTERED INDEX IX_ReferenciaOrden
ON dbo.GP_ContabilidadTrxActivas (IDCuenta ASC, ReferenciaOrden ASC)
INCLUDE (Empresa,  Año, Periodo, Saldo, FechaTransacción);

The execution plans shows that 87% of the cost are in 3 activities: Index Seek, Stream Aggregate and Merge Join, this is the image of the partial estimated execution plan:

enter image description here

dbo.GP_ContabilidadTrxHistoricas has 3.559.617 rows and dbo.GP_ContabilidadTrxActivas has 102.707 rows

Any advise to optimize it would be more than welcome. Thanks in advance.

Upvotes: 1

Answers (3)

DRapp

Reputation: 48139

First, seeing the "INCLUDE" as part of your index confused me as never seeing that so I looked into it and found an excellent explanation/answer in this post. The important note as that the INCLUDE should be on fields that are NOT part of things like a group by. Your query DEFINITELY uses the columns as part of the group by and SHOULD be part of the normal covering index for query optimization.

Second, what is probably killing your time is that you are doing a correlated query for your column Saldo for every returned record in the baseline query thus killing performance running every time repeatedly. I would restructure your query to have the main FROM clause as those two queries run ONCE EACH and JOIN them on the columns respectively. It appears that for each deeper level item, you also want the parent level aggregation total. For example all sales within a given region is one column, but also including the total for comparison to the ENTIRE region. I may be incorrect, but that is what it appears to be.

So, I would just create your index as the following keys on each current and history transaction tables. The first 3 columns are specifically this order to match your higher level aggregations so THAT TOO is optimized without going to the granular level of Ano, Periodo, FechaTransaccion.

( Empresa, IdCuenta, ReferenciaOrden, Ano, Periodo, FechaTransaccion ) include ( saldo )

SELECT 
      BaseTotal.Empresa, 
      BaseTotal.IDCuenta, 
      BaseTotal.Año, 
      BaseTotal.Periodo, 
      BaseTotal.Saldo, 
      SUM( BaseInt.Saldo ) as OrdenSaldo
   FROM
      ( SELECT 
              Empresa,
              IDCuenta,
              ReferenciaOrden,
              Año,
              Periodo,
              SUM(Saldo) As Saldo
           FROM
              ( SELECT
                      Empresa,
                      IDCuenta,
                      ReferenciaOrden,
                      Año,
                      Periodo,
                      SUM(Saldo) As Saldo 
                   FROM 
                      dbo.GP_ContabilidadTrxActivas
                   WHERE 
                      FechaTransacción <= GETDATE()
                   GROUP BY
                      Empresa,
                      IDCuenta,
                      ReferenciaOrden,
                      Año,
                      Periodo
                UNION ALL
                SELECT
                      Empresa,
                      IDCuenta,
                      Año,
                      Periodo,
                      ReferenciaOrden,
                      SUM(Saldo) As Saldo 
                   FROM 
                      dbo.GP_ContabilidadTrxHistoricas
                   WHERE 
                      FechaTransacción <= GETDATE()
                   GROUP BY
                      Empresa,
                      IDCuenta,
                      ReferenciaOrden,
                      Año,
                      Periodo ) As Base
           GROUP BY 
              Empresa,
              IDCuenta,
              ReferenciaOrden,
              Año,
              Periodo ) As BaseTotal 
      JOIN
      ( SELECT 
              Empresa, 
              IDCuenta, 
              ReferenciaOrden,  
              SUM(Saldo) As Saldo
           FROM
              ( SELECT
                      Empresa,
                      IDCuenta,
                      ReferenciaOrden,
                      SUM(Saldo) As Saldo 
                   FROM 
                      dbo.GP_ContabilidadTrxActivas
                   WHERE 
                      FechaTransacción <= GETDATE()
                   GROUP BY
                      Empresa, 
                      IDCuenta, 
                      ReferenciaOrden
                UNION ALL
                SELECT
                      Empresa, 
                      IDCuenta, 
                      ReferenciaOrden, 
                      SUM(Saldo) As Saldo 
                   FROM 
                      dbo.GP_ContabilidadTrxHistoricas
                   WHERE 
                      FechaTransacción <= GETDATE()
                   GROUP BY
                      Empresa, 
                      IDCuenta, 
                      ReferenciaOrden ) As Base
           GROUP BY 
              Empresa, 
              IDCuenta, 
              ReferenciaOrden ) As BaseInt
         ON BaseTotal.Empresa = BaseInt.Empresa
         AND BaseTotal.IDCuenta = BaseInt.IDCuenta
         AND BaseInt.ReferenciaOrden <= BaseTotal.ReferenciaOrden 
   GROUP BY
      BaseTotal.Empresa, 
      BaseTotal.IDCuenta, 
      BaseTotal.Año, 
      BaseTotal.Periodo, 
      BaseTotal.Saldo, 
   ORDER BY
      BaseTotal.Empresa, 
      BaseTotal.IDCuenta, 
      BaseTotal.Año, 
      BaseTotal.Periodo

Upvotes: 2

GriGrim

Reputation: 2921

If you have SQL Server 2005 or higher you can try this:

DECLARE @tempTable TABLE (Empresa VARCHAR(100), IDCuenta INT, Año INT, Periodo INT, ReferenciaOrden INT, Saldo MONEY)

INSERT INTO @tempTable (Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, Saldo)
SELECT Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, SUM(Saldo) AS Saldo
FROM (
    SELECT Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, Saldo
    FROM dbo.GP_ContabilidadTrxActivas
    WHERE FechaTransacción <= GETDATE()
    UNION ALL
    SELECT Empresa, IDCuenta, Año, Periodo, ReferenciaOrden, Saldo
    FROM dbo.GP_ContabilidadTrxHistoricas
    WHERE FechaTransacción <= GETDATE()
) AS Base
GROUP BY Empresa, IDCuenta, Año, Periodo, ReferenciaOrden

SELECT Empresa, IDCuenta, Año, Periodo, Saldo
    , (
        SELECT SUM(Saldo)
        FROM @tempTable AS BaseInt
        WHERE BaseInt.IDCuenta = BaseTotal.IDCuenta
            AND BaseInt.Empresa = BaseTotal.Empresa
            AND BaseInt.ReferenciaOrden <= BaseTotal.ReferenciaOrden
    ) AS SaldoAcumulado
FROM @tempTable AS BaseTotal

And also maybe creating index that contains FechaTransacción field can help. Because you filter tables by it.

Upvotes: 2

TomT

Reputation: 971

You are filtering by date, I would suggest you create your indexes like this

CREATE NONCLUSTERED INDEX IX_ReferenciaOrden
ON dbo.GP_ContabilidadTrxHistoricas (FechaTransacción)

If that doesn't help you, try to add the columns as they are in GROUP By clause. This way the index is sorted in the same way as required by GROUp BY

CREATE NONCLUSTERED INDEX IX_ReferenciaOrden
ON dbo.GP_ContabilidadTrxHistoricas (FechaTransacción, Empresa, IDCuenta, Año, Periodo,ReferenciaOrden)

If you still think this is slow, create the covering index with columns from select clause, this way the clustered index doesn't need to be accessed at all

CREATE NONCLUSTERED INDEX IX_ReferenciaOrden
ON dbo.GP_ContabilidadTrxHistoricas (FechaTransacción, Empresa, IDCuenta, Año, Periodo,ReferenciaOrden)
INCLUDE(Saldo)

You can also try to reformat the query using CTEs

Upvotes: 1

Helping Optimizing a SQL Query

Answers (3)

Related Questions