Martin AJ
Martin AJ

Reputation: 6697

How can I restrict the the result based on the values in the select statement?

I have a query like this:

SELECT u.id, u.name, top_tags(u.id) as top_tags, cal_rep(u.id) rep
FROM users 
ORDER BY rep
LIMIT :p,20;

top_tags() is a function that returns either NULL or a comma separated string of user's top three tags. Also cal_rep() is another function which returns either 0 or the number of reputation. So query above always returns a list of all users. Something like this:

+----+----------+-------------------------------+-----------+
| id |   name   |            top_tags           |    rep    |
+----+----------+-------------------------------+-----------+
| 1  | Jack     | HTML,CSS,jQuery               | 3244      |
| 2  | Peter    | SQL-Server,MySQL,Database     | 543       |
| 3  | Martin   | NULL                          | 0         |
+----+----------+-------------------------------+-----------+    

While I want to omit the users who have NULL as their top_tags and 0 reputation. So this is the expected result:

+----+----------+-------------------------------+-----------+
| id |   name   |            top_tags           |    rep    |
+----+----------+-------------------------------+-----------+
| 1  | Jack     | HTML,CSS,jQuery               | 3244      |
| 2  | Peter    | SQL-Server,MySQL,Database     | 543       |
+----+----------+-------------------------------+-----------+    

How can I do that?

Upvotes: 0

Views: 338

Answers (3)

Paul Spiegel
Paul Spiegel

Reputation: 31772

You can use the same expressions in the WHERE clause.

SELECT u.id, u.name, top_tags(u.id) as top_tags, cal_rep(u.id) rep
FROM users 
WHERE top_tags(u.id) IS NOT NULL
  AND cal_rep(u.id) <> 0
ORDER BY rep
LIMIT :p,20;

Not sure if you want AND or OR.

If you don't want to repeat the expressions from the SELECT clause, you can use their aliases in the HAVING clause.

SELECT u.id, u.name, top_tags(u.id) as top_tags, cal_rep(u.id) rep
FROM users 
HAVING top_tags IS NOT NULL
   AND rep <> 0
ORDER BY rep
LIMIT :p,20;

Regarding the performance: On a table with a sequence of 1 million integers i have tested the following queries (MySQL 5.7.18):

select i
from helper.seq
-- 0.265 sec

select length(sha2(sha2(i, 512), 512)) as l
from helper.seq
-- 1.843 sec

select length(sha2(sha2(i, 512), 512)) as l
from helper.seq
where length(sha2(sha2(i, 512), 512)) = 128
-- 3.437 sec

select length(sha2(sha2(i, 512), 512)) as l
from helper.seq
having l = 128
-- 3.531 sec

select *
from (
    select length(sha2(sha2(i, 512), 512)) as l
    from helper.seq
) sub
where l = 128
-- 3.547 sec

Comparing the first two queries, we can see that the nested function call has an overhead of ~ 1.6 seconds. For the other three queries we see the same overhead doubled. I don't know what's going on behind the scenes, but it looks like MySQL will execute the functions twice regardless of the method.

Upvotes: 1

Santo
Santo

Reputation: 69

select *
from (
  SELECT u.id, u.name, top_tags(u.id) as top_tags, cal_rep(u.id) rep
  FROM users 
  LIMIT :p,20
) a
where a.top_tags is not null 
and   a.rep != 0
ORDER BY a.rep;

In this way, functions are executed only one time.

Upvotes: 0

Sal
Sal

Reputation: 1317

Use the WHERE clause and is null operator.

Upvotes: 1

Related Questions