Narcil
Narcil

Reputation: 355

mysql Alternatives to INTERSECT

I have been trying for the best part of the day to remove the intersect since it is not supported by mysql. If anyone can provide some pointers it would be really helpful.

SELECT *, DATE_FORMAT(NOW(), '%Y') - DATE_FORMAT(dob, '%Y') - (DATE_FORMAT(NOW(), '00-%m-%d') < DATE_FORMAT(dob, '00-%m-%d')) AS age 
FROM user U, user_utilisation UU 
WHERE U.id_user = UU.id_user AND cp >= 1 AND cp <= 3000 AND sexe = 'M' AND UU.id_mailing = 6 
GROUP BY U.id_user 
HAVING age >= 1 AND age <= 100 
ORDER BY nom, prenom

INTERSECT 

SELECT *, DATE_FORMAT(NOW(), '%Y') - DATE_FORMAT(dob, '%Y') - (DATE_FORMAT(NOW(), '00-%m-%d') < DATE_FORMAT(dob, '00-%m-%d')) AS age 
FROM user U, user_utilisation UU 
WHERE U.id_user = UU.id_user AND cp >= 1 AND cp <= 3000 AND sexe = 'M' AND UU.id_mailing = 7 
GROUP BY U.id_user 
HAVING age >= 1 AND age <= 100 
ORDER BY nom, prenom

I tried with JOIN(s) but here's what I have right now:

SELECT *, DATE_FORMAT(NOW(), '%Y') - DATE_FORMAT(naissance, '%Y') - (DATE_FORMAT(NOW(), '00-%m-%d') < DATE_FORMAT(naissance, '00-%m-%d')) AS age 
FROM user U, user_utilisation UU 
WHERE U.id_user = UU.id_user AND cp >= 1 AND cp <= 3000 AND sexe = 'M' 
    AND UU.id_user IN (select id_user from user_utilisation where id_mailing = 6 OR id_mailing = 7) 
HAVING age >= 1 AND age <= 100 
ORDER BY nom, prenom

but by removing the GROUP BY I see that the query selected 2 records where id_mailing = 1, while the GROUP BY is hiding the wrong record. I'm pretty sure that this could cause problems...

user_utilisation only has three fields id_user, id_mailing, and date.

Upvotes: 4

Views: 907

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1271003

I think you can simplify the query. It is has several curiosities. First, the query doesn't use proper join syntax. Second, the query is using the having clause just to filter on an alias from the select. I recommend a subquery in that case.

It looks like you are trying to get users who are in both "groups" as defined by two very similar queries (only the where clause is different). The following may be what you are looking for:

select t.*
from (SELECT *, DATE_FORMAT(NOW(), '%Y') - DATE_FORMAT(dob, '%Y') - (DATE_FORMAT(NOW(), '00-%m-%d') < DATE_FORMAT(dob, '00-%m-%d')) AS age,
             (case when cp >= 1 AND cp <= 3000 AND sexe = 'M' AND UU.id_mailing = 6 then 'FirstGroup'
                   when cp >= 1 AND cp <= 3000 AND sexe = 'M' AND UU.id_mailing = 7 then 'SecondGroup'
              end) as thegroup
      FROM user U join user_utilisation UU 
           on U.id_user = UU.id_user
     ) t
where thegroup is not null and age between 1 and 100
GROUP BY U.id_user
having max(thegroup) <> min(thegroup)
ORDER BY nom, prenom

The having clause is a short-hand way of saying that the user is in both groups.

Upvotes: 0

DWright
DWright

Reputation: 9500

Interpreting your queries, it seems to me:

  1. You want to get all users who participate in mailing_id 6 and 7 (that's why you are doing the INTERSECT, is that right?).
  2. You want to restrict those users by criteria and perform an age calculation.

If my interpretation of your intent is correct (not at all sure that is the case!), you don't need INTERSECT at all, you should be able to simply select the users you want with the criteria you want, and restrict participation in both mailing 6 and 7 via JOIN:

SELECT *, DATE_FORMAT(NOW(), '%Y') - DATE_FORMAT(dob, '%Y') - (DATE_FORMAT(NOW(), '00-%m-%d') < DATE_FORMAT(dob, '00-%m-%d')) AS age 
FROM user U
INNER JOIN user_utilisation UU on U.id_user = UU.id_user and UU.mailing_id = 6
INNER JOIN user_utilisation UU2 on U.id_user = UU2.id_user and UU2.mailing_id = 7
WHERE cp >= 1 AND cp <= 3000 AND sexe = 'M' 
HAVING age >= 1 AND age <= 100 
ORDER BY nom, prenom

Upvotes: 2

Mike Dinescu
Mike Dinescu

Reputation: 55760

You could try something like the following:

SELECT *
  FROM
      (SELECT *
            , DATE_FORMAT(NOW(), '%Y') - DATE_FORMAT(dob, '%Y') - (DATE_FORMAT(NOW(), '00-%m-%d') < DATE_FORMAT(dob, '00-%m-%d')) AS age 
        FROM user U
             INNER JOIN user_utilisation UU 
                     ON U.id_user = UU.id_user
        WHERE cp >= 1 AND cp <= 3000 
              AND sexe = 'M'
              AND UU.id_mailing = 6 
        GROUP BY U.id_user 
        HAVING age >= 1 AND age <= 100) QUERY1
 WHERE EXISTS(SELECT *
            , DATE_FORMAT(NOW(), '%Y') - DATE_FORMAT(dob, '%Y') - (DATE_FORMAT(NOW(), '00-%m-%d') < DATE_FORMAT(dob, '00-%m-%d')) AS age 
        FROM user U
             INNER JOIN user_utilisation UU 
                     ON U.id_user = UU.id_user
        WHERE cp >= 1 AND cp <= 3000 
              AND sexe = 'M'
              AND UU.id_mailing = 7
              AND U.id_user = QUERY1.id_user
        GROUP BY U.id_user 
        )
ORDER BY nom, prenom

The trick in the query above is that the WHERE EXISTS( ... ) clause acts as a filter, similar to the INTERSECT you were using before. The idea is that you will want to only select those records from the query QUERY1 that meet the criteria that there is at least one record in the query in the EXITS clause which is conditioned on the id_user being the same as the id_user from the QUERY1 query.. I hope this makes some sense..

I could probably tune it a little bit for you if you would provide the structure of the User table. And by the way, using those blanket SELECT * statements is not a good idea..

Upvotes: 1

Related Questions