Reputation: 173
I have a table with huge row count in mysql (though I am looking for generic SQL solution)
very_big_table(INT a, INT b, INT c, ...)
I wanted SELECT statement
SELECT a,
(b + c) as expression1,
(b + c + a) AS expression2 -- basically (expression1 + a)
FROM very_big_table
WHERE ...
GROUP BY a
ORDER BY a DESC
this looks good and easily readable as long as the expression1 is simple.
But when CASE-WHEN/IFNULL()/SUM()/MIN()/STRCAT() or some Operator comes into play in these expressions its difficult to read and debug.
I have gone through some of the already asked questions
assigning mysql value to variable inline
Use value of a column for another column (SQL Server)?
How to use conditional columns values in the same select statement?
But if I use the approaches described something like
SELECT a,
expression1,
(expression1 + a) AS expression2
FROM
(SELECT a,
(b + c) AS expression1
FROM very_big_table
WHERE ...
GROUP BY a) as inner_table
ORDER BY a DESC
this works fine, but this query is taking some 70x more time to execute. Atleast when i fired it, though only once.
what if I have multiple levels of the expressions in the output columns?
Is there any elegant way to deal with this, without compromising readability?
BTW why isnt this expression reuse or alias reference in select statement not supported by SQL standards or vendors? (supposing there are no cyclic evaluation in the single SELECT statement expressions. in that case the compiler fails)
Upvotes: 8
Views: 15969
Reputation: 39423
For Postgresql users
If you want to maximize readibility of the complex computation and/or logic, encapsulate the computation in its own function
create table x
(
a int not null,
b int not null,
c int not null
);
create table y
(
a int not null,
z int not null
);
select * from x;
insert into x VALUES
(1,2,3), (4,5,6);
insert into y values
(1, 100);
create or replace function computer
(
rx x, -- this emphasizes that the source of data come from table x
ry y, -- this emphasizes that the source of data come from table x
out expression1 int, out expression2 int, out expression3 int
)
as $$
begin
expression1 := rx.b + rx.c;
expression2 := expression1 + rx.a;
expression3 := expression2 + ry.z;
end;
$$ language 'plpgsql';
Test query:
select x.a, x.b, x.c, y.z, answer.*
from x
join y on x.a = y.a
cross join lateral computer(x,y) answer
Output:
If you just need to quickly test the function's returned values, it can be expanded by using parenthesis-asterisk syntax:
select x.a, x.b, x.c, y.z, (computer(x, y)).*
from x
join y on x.a = y.a
Don't use that in production though, if the function has 50 returned columns, the function will be called 50 times. Use cross join lateral
in production. See: How can you expand a "condensed" PostgreSQL row into separate columns?
If your function don't need to concern itself where the data source come from, just use the record
type on function's parameter
create or replace function computer_b
(
anon record,
out expression1 int, out expression2 int, out expression3 int
)
as $$
begin
expression1 := anon.b + anon.c;
expression2 := expression1 + anon.a;
expression3 := expression2 + anon.z;
end;
$$ language 'plpgsql';
Test query:
with data_source as
(
select x.*, y.z
from x
join y on x.a = y.a
)
select ds.*, answer.*
from data_source ds
cross join lateral computer_b(ds) answer
Upvotes: 0
Reputation: 404
You can use User-Defined Variable to solve your problem. Your SQL can be rewritten as:
SELECT a,
@expr1 := (b + c) as expression1,
(@expr1 + a) AS expression2
FROM very_big_table
WHERE ...
GROUP BY a
ORDER BY a DESC
You can refer to this post.
Upvotes: 8
Reputation: 1028
I have 2 choices for that..
using your actual query as a subquery
select a,exp1,exp1+a as exp2
from (SELECT a
,(b + c) as exp1
FROM very_big_table
WHERE ...
GROUP BY a
)V
ORDER BY a DESC
or adding a outer apply statment for your query...
SELECT a
,OA.exp1 as expression1
,(OA.exp1 + a) AS expression2 -- basically (expression1 + a)
FROM very_big_table
outer apply (select (b + c) as exp1) OA
WHERE ...
GROUP BY a
ORDER BY a DESC
I guess the second option, using outer apply, is better for read...
keep in mind that Outer apply runs for each row
so it may be a bad idea if the exp1 have to access large amounth of data from tables..
anyway, just using fields that you are already getting in your actual query you will not have a great cost by adding it.
so.. what way you would choose?
Upvotes: 2
Reputation: 1870
workaround assuming that the expressions are not sql select sub-clauses, but rather nested operators/functions purely acting on data retrieved in the original "select from" clause so the task at hand is purely and simply to make the code "look pretty": create user-defined-functions
then your query would look like
select a
, myfunction( b, c)
, myfunction( b, c) + a
...
upside -this cleans up the "select code" which was your stated goal, also you can centrally manage more complex logic
downside - this is not "generic sql" that will port to other systems and maybe not worth the investment in time if the functions will rarely be re-used
Upvotes: 0