Reputation: 165
There are 4 tables as follows: the table [T_D ]
has about 5 million rows and other tables have less than 5000 rows; [T_D ]
has three primary key columns: c_id, datetime, a_id
.
The relations between the 4 tables are as follows:
table T_D
(
c_id numeric, -- primary key, reference on (M_C.c_id )
datetime datetime, -- primary key
a_id numeric, -- primary key, reference on (M_A.a_id )
data float
)
table M_C
(
c_id numeric, -- primary key
...
)
table M_B
(
b_id numeric, -- primary key
...
)
table M_A
(
a_id numeric, -- primary key
b_id numeric -- reference on (M_B.b_id)
...
)
I'm using SQL Server 2008R2, it takes several seconds to run the following SQL statement. I have tried some solutions to fix this problem, but it doesn't work well, it still takes about 10 seconds to complete running.
How to optimize this SQL query, and to minimize the executing time to less than 200 milliseconds? There are about 120 records in the date range of the following sql query.
The SQL query to optimize:
SELECT
C.c_id, SUM(T.DATA) DATA
FROM
T_D T
LEFT JOIN
M_A A ON A.a_id = T.a_id
LEFT JOIN
M_B B ON B.b_id = A.b_id,
M_C C
WHERE
1=1
AND C.c_id IN (102, 106, 234, 868,319, 347, 215, 162, 297, 372, 630, 280, 347, 572, 321, 239, 252, 724, 233, 39968, 191)
AND T.c_id = C.c_id
AND T.DATETIME >= '2018-07-01 00:00:00.000'
AND T.DATETIME <= '2018-07-05 00:00:00.000'
AND B.b_id IN (100)
GROUP BY
C.c_id
Any good suggestions would be appreciated!
Upvotes: 1
Views: 60
Reputation: 5094
Relationship and data type seem ok.
M_B is the only table which is not related with T_D.
See if you have define FK constraint then no need to Join those table in query.
As per your visible query,main problem is your query
No need to use M_C C in query.
If possible use INNER JOIN or EXISTS clause.for example output from M_B and M_A is not require in resultset then you can try using EXISTS clause.
Since T_D has lot of record so put predicate in T_D columns which will improve cardinality estimate.
Notice : T.c_id IN (102,106,234,868,319,347,215,162,297,372,630,280,347,572,321,239,252,724,233,39968,191 )
SELECT
T.c_id ,SUM(T.DATA) DATA
FROM
T_D T
LEFT JOIN
M_A A
ON A.a_id = T.a_id
LEFT JOIN
M_B B
ON B.b_id = A.b_id
--, M_C C
WHERE
-- 1=1
T.c_id IN (102,106,234,868,319,347,215,162,297,372,630,280,347,572,321,239,252,724,233,39968,191 )
--AND T.c_id= C.c_id
AND T.DATETIME >= '2018-07-01 00:00:00.000'
AND T.DATETIME <= '2018-07-05 00:00:00.000'
AND B.b_id IN(
100
)
GROUP BY
T.c_id
Alternatively you can try
SELECT
T.c_id ,SUM(T.DATA) DATA
FROM
T_D T
where T.c_id IN (102,106,234,868,319,347,215,162,297,372,630,280,347,572,321,239,252,724,233,39968,191 )
AND T.DATETIME >= '2018-07-01 00:00:00.000'
AND T.DATETIME <= '2018-07-05 00:00:00.000'
And Exists(
select 1 from
M_A A
inner JOIN
M_B B
ON B.b_id = A.b_id
WHERE
A.a_id = T.a_id
AND B.b_id IN(
100
)
)
GROUP BY
T.c_id
Don't use M_C C
in join simple because it is not require.
There are several reason for "index not being use." But we can optimize query.
Upvotes: 1