Reputation: 501
I programmed a filter which generates a Query to show special employees.
I have table employees and a lot of 1:1, 1:n and n:m relationships e.g. for skills and languages for the employees like this:
Employees
id name
1 John
2 Mike
Skills
id skill experience
1 PHP 3
2 SQL 1
Employee_Skills
eid sid
1 1
1 2
Now I want to filter employees which have at least 2 years experience in using PHP and 1 year SQL.
My filter always generates a correct working Query for every table, relationship and field.
But now my problem is when I would like to filter the same field in a related table multiple times with a and it does not work.
e.g.
John PHP 3
John SQL 1
PHP and SQL are different rows so AND can not work.
I tried using group_concat and find_in_set but I have the problem that I can not filter experience over 2 years with find_in_set and find_in_set does not know PHP is 3 and SQL is 1.
I also tried
WHERE emp.id IN (SELECT eid FROM Employee_Skills WHERE sid IN (SELECT id FROM Skills WHERE skill = 'PHP' AND experience > 1)) AND emp.id IN (SELECT eid FROM Employee_Skills WHERE sid IN (SELECT id FROM Skills WHERE skill = 'SQL' AND experience > 0))
which works for this example, but it only works for n:m and it too complex to know the relationship type.
I have the final Query with
ski.skill = 'PHP' AND ski.experience > 1 AND ski.skill = 'SQL' AND ski.experience > 0
and I would like to manipulate the Query to make it work.
How does a Query have to look like to deal with relational division.
Upvotes: 1
Views: 120
Reputation: 57408
The straightforward way would be to repeatedly JOIN the skills:
SELECT e.*
FROM Employees AS e
JOIN Employee_Skills AS j1 ON (e.id = j1.eid)
JOIN Skills AS s1 ON (j1.sid = s1.id AND s1.skill = 'PHP' AND s1.experience > 3)
JOIN Employee_Skills AS j2 ON (e.id = j2.eid)
JOIN Skills AS s2 ON (j2.sid = s2.id AND s2.skill = 'SQL' AND s2.experience > 1)
...
Since all the clauses are required this translated to a straight JOIN.
You will need to add two JOINs for each clause, but they're quite fast joins.
A more hackish way would be to compress the skills into a code in a 1:1 relation with the employees. If experience never exceeds, say, 30, then you can multiply the first condition's experience by 1, the second by 30, the third by 30*30, the fourth by 30*30*30... and never get an overflow.
SELECT eid, SUM(CASE skill
WHEN 'PHP' THEN 30*experience
WHEN 'SQL' THEN 1*experience) AS code
FROM Employees_Skills JOIN Skills ON (Skills.id = Employees_Skills.sid)
GROUP BY eid HAVING code > 0;
Actually since you want 3 years PHP, you can HAVE code > 91. If you had three conditions with experiences 2, 3 and 5, you would request more than x = 2*30*30 + 3*30 + 5. This only serves to whittle the results, since 3*30*30 + 2*30 + 4 still passes the filter but is of no use to you. But since you want a restriction on code, and "> x" costs the same as "> 0" and gives better results... (if you needed more complex filtering than a series of AND, > 0 is safer, though).
The table above you join with Employees, then on the result you perform the true filtering, requiring
((code / 30*30) % 30) > 7 // for instance :-)
AND
((code / 30) % 30) > 3 // for PHP
AND
((code / 1) % 30) > 1 // for SQL
(the *1 and /1 are superfluous, and only inserted to clarify)
This solution requires a full table scan on Skills, with no real possibility of automatic optimizations. So it is slower than the other solution. On the other hand, its cost grows much more slowly, so if you have complex queries, or need OR operators or conditional expressions instead of ANDs, it may be more convenient to implement the "hackish" solution.
Upvotes: 0
Reputation: 350345
You could make a kind of pivot query, where you put the experience in each of all of the known skills in columns. This could be a long query, but you could build it dynamically in php, so it would add all skills as columns to the final query, which would look like this:
SELECT e.*, php_exp, sql_exp
FROM Employee e
INNER JOIN (
SELECT es.eid,
SUM(CASE s.skill WHEN 'PHP' THEN s.experience END) php_exp,
SUM(CASE s.skill WHEN 'SQL' THEN s.experience END) sql_exp,
SUM(CASE s.skill WHEN 'JS' THEN s.experience END) js_exp
-- do the same for other skills here --
FROM Employee_Skills es
INNER JOIN Skills s ON es.sid = s.id
GROUP BY es.eid
) pivot ON pivot.eid = e.id
WHERE php_exp > 2 AND sql_exp > 0;
The WHERE
clause is then very concise and intuitive: you use the logical operators like in other circumstances.
If the set of skills is rather static, you could even create a view for the sub-query. Then the final SQL is quite concise.
Here is a fiddle.
Alternative
Using the same principle, but using the SUM
in the HAVING
clause, you can avoid gathering all skill's experiences:
SELECT e.*
FROM Employee e
INNER JOIN (
SELECT es.eid
FROM Employee_Skills es
INNER JOIN Skills s ON es.sid = s.id
GROUP BY es.eid
HAVING SUM(CASE s.skill WHEN 'PHP' THEN s.experience END) > 2
AND SUM(CASE s.skill WHEN 'SQL' THEN s.experience END) > 0
) pivot ON pivot.eid = e.id;
Here is a fiddle.
You can also replace the CASE
construct by the IF
function, like this:
HAVING SUM(IF(s.skill='PHP', s.experience, 0)) > 2
... etc.
But it comes down to the same.
Upvotes: 0
Reputation: 24146
you can try next approach:
select * from Employees
where id in (
select eid
from Employee_Skills as a
inner join
Skills as ski
on (a.sid = ski.id)
where
(ski.skill = 'PHP' AND a.experience > 2) OR
(ski.skill = 'SQL' AND a.experience > 1)
group by eid
having count(*) = 2
)
so, for every filter you will add OR statement, having
will filter employees with all filters passed, just pass appropriate number
Upvotes: 1