Given employees and departments tables find missing department id

Question

I am given a table of employees and a table of departments. The table "Employees" has missing manager id and missing department id. The structure of tables is in the fiddle below. I am asked to find those missing department ids. From what I see, one manager can work with several departments. There are no further assumptions. This is what I have tried. But it is not correct. Is it possible to solve without having more information about the problem? I assume also no hardcoding of missing data.

select 
e.name as emp_name,
d.name as dep_name,
Dep_id, Manager_id, 
Salary
from Employees as e 
left join departments as d
on d.id=e.dep_id
order by Salary  desc;

http://sqlfiddle.com/#!9/f8fa05

CREATE TABLE IF NOT EXISTS `departments` (
  `id` int(6) unsigned NOT NULL,
  `name` varchar(200),
  PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
INSERT INTO `departments` (`id`, `name`) VALUES
  ('1', 'Finance'),
  ('2', 'Operations'),
  ('3', 'Deployment');
CREATE TABLE IF NOT EXISTS `employees` (
  `id` int(6) unsigned NOT NULL,
  `name` varchar(200),
  `Dep_id` int(6) unsigned ,
  `Manager_id` int(6) unsigned,
  `Salary` int ,

  PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
INSERT INTO `employees` (`id`, `name`, `Dep_id`, `Manager_id`, `Salary`) VALUES
  ('1', 'John Smith',1,NULL,2000 ),
  ('2', 'Jack Smith',NULL,1, 1500),
  ('3', 'Becky Smith',1,2,2000),
  ('4', 'Rebecca Smith',2,2,700),
  ('5', 'Sonny Smith',3,1,3000);

forpas · Accepted Answer

The only way that I could think of is getting dep_id and manager_id from other employees who have the same not null manager_id and dep_id respectively.
This does not give a unique solution if as you say: one manager can work with several departments, this is why I used aggregation to pick one of them:

select 
  e.id, e.name, 
  max(coalesce(e.dep_id, ee.dep_id)) dep_id, 
  max(coalesce(e.manager_id, ee.manager_id)) manager_id, 
  e.salary
from employees e 
left join employees ee on ee.dep_id = e.dep_id or ee.manager_id = e.manager_id
group by e.id, e.name, e.salary
order by e.id

If the table is large maybe it would be more efficient to split the query in 2 parts (to avoid joins and aggregation for all the rows) and then use UNION ALL for the final results:

select e.* from employees e 
where e.dep_id is not null and e.manager_id is not null
union all
select 
  e.id, e.name, 
  max(coalesce(e.dep_id, ee.dep_id)) dep_id, 
  max(coalesce(e.manager_id, ee.manager_id)) manager_id, 
  e.salary
from employees e 
left join employees ee on ee.dep_id = e.dep_id or ee.manager_id = e.manager_id
where e.dep_id is null or e.manager_id is null
group by e.id, e.name, e.salary
order by id

See the demo.
Results:

id | name          | dep_id | manager_id | salary
-: | :------------ | -----: | ---------: | -----:
 1 | John Smith    |      1 |          2 |   2000
 2 | Jack Smith    |      3 |          1 |   1500
 3 | Becky Smith   |      1 |          2 |   2000
 4 | Rebecca Smith |      2 |          2 |    700
 5 | Sonny Smith   |      3 |          1 |   3000

So 'John Smith' gets manager_id = 2 because 'Becky Smith' is in the same department and has this manager_id.
Also 'Jack Smith' gets dep_id = 3 because 'Sonny Smith' has the same manager with that dep_id.

Given employees and departments tables find missing department id

Answers (1)

Related Questions