Harsh Joshi
Harsh Joshi

Reputation: 33

Looking for performance improvements in the SQL

Scenario: There are 2 columns in the table with data as given in the sample below. It is possible that the table has multiple rows for the same value of 'a' column.

In the example, Considering the 'a' column, There are three rows for '1' and one row for '2'.

Sample table 't1':

|a|b  |
|1|1.1|
|1|1.2|
|1|2.2|
|2|3.1|

Requirement is to get following output:

Expected Query output:

|a|b  |
|1|1.2|
|2|3.1|

Requirement:

Query I used:

select distinct min(a) over(partition by table1.a) as a,
min(b) over(partition by table1.a) as b
from (
    SELECT  distinct Min(table2.a) OVER (PARTITION BY table2.a) AS a, 
    Max(table2.b) OVER (PARTITION BY table2.a) AS b
    FROM t1 table2
  union
    SELECT  distinct Min(table3.a) OVER (PARTITION BY table3.a) AS a, 
    Max(table3.b) OVER (PARTITION BY table3.a) AS b
    FROM t1 table3 
    where table3.a = FLOOR(table3.b)
) table1;

This query is working and I am getting the desired output. Looking for inputs to improve by removing union and extra select from the above script.

Note: t1 is not a table but it's a procedure call in my case and there are additional columns that it returns. It would help if the extra call to the procedure can be avoided.

Upvotes: 3

Views: 64

Answers (1)

ekochergin
ekochergin

Reputation: 4129

This is how I would get the data you need.

select t1.a, max(t1.b) 
  from (select a, b, count(1) over(partition by t1.a) cnt from t1) t1
 where t1.a = floor(t1.b) or cnt = 1
 group by t1.a ,cnt;

It has only one procedure call so it might run significantly faster

And please note that "union" clause not only appends two data sets, but removes duplicates as well. Removing duplicates causes additional checks between data sets and therefore is leading to performance issues.

It is in most cases better to use "union all" which doesn't check for duplicates

Upvotes: 1

Related Questions