Insert one column DISTINCT, with corresponding other columns from one table to another

Question

I have a problem with database query. I have three tables projects, developers and email. In developers table, there are a lot of rows with same name but different email. I have to insert the distinct names but all the emails(in the row of name to which they belong) in email table i.e

    example
    /////////////////////////////////////////////
     developers table have records:-
     id_developer  project_id  name  email
          0            1       umar  umar@gmail.com
          1            1       umar  umar@developers.com

     Now i want to inert the data in email table as:-
     user_id    name      email_ids
        0       umar      umar@gmail.com
                          umar@developers.com
    ////////////////////////////////////////////

    projects
    ----------
    id_project
    name
    ----------

    developers
    ----------
    id_developer
    project_id
    name
    email
    ----------

    email
    ----------
    user_id
    name
    email_ids
    ----------

Following is my current query. Please help me. Thanks in advance

    INSERT INTO email(user_id, dev_name, email_ids) 
    SELECT p.id_project, 
           d.name, 
           d.email 
      FROM projects p 
            INNER JOIN developers AS d 
            ON p.id_project = d.project_id 
     WHERE d.name IN (SELECT name 
                        FROM developers 
                       GROUP BY name HAVING(COUNT(name) > 1 )) 
     GROUP BY(d.name)

Jorge Campos · Accepted Answer

After some conversation in the comments what you really need is a proper data modeling here.

Having the data the way you wan't in the database is a very bad practice.

  user_id    name      email_ids
    0       umar      umar@gmail.com
                      umar@developers.com

You will end it up having problems in the future to retrieves this data because you will have to figure out a way how to retrieve or split this data when you need then.

So, based on your current model to attend your requirement you would need just to change the table email a bit. Your model would be this way:

projects       developers        email
----------     -------------     ------------
id_project     id_developer      id
name           project_id        id_developer
               name              email
----------     -------------     ------------

So, since you already have the data in the developers table lets first drop table table email and recreate it the right way. You will need to execute:

drop table email;
create table dev_email( -- changed the name because there is a field with same name
    id INTEGER AUTO_INCREMENT NOT NULL,
    id_developer INTEGER NOT NULL, -- this column should be the same type 
                                   -- as id_developer in the table developers
    email VARCHAR(150) NOT NULL
    PRIMARY KEY pk_email (id),
    CONSTRAINT uk_developer_email UNIQUE (id_developer, email), -- that will avoid duplicates,
    CONSTRAINT fk_dev FOREIGN KEY (developer_id) 
         REFERENCES developers(id_developer)
          ON UPDATE RESTRICT ON DELETE RESTRICT
);

Now lets fill this table with the right data:

INSERT INTO dev_email (id_developer, email)
   SELECT min(id_developer), email
     FROM developers
    GROUP BY email;

After that we must delete the duplicated data from the developers table like so:

DELETE FROM developers d
  WHERE NOT EXIST (SELECT 1 
                     FROM dev_email de 
                    WHERE de.id_developer = d.id_developer);

Then we drop the column that is no longer needed in the developers table:

ALTER TABLE developers DROP COLUMN email;

This should give you a proper normalized model.

Now if you need to retrieve the developer with all emails concatenated (which is simpler than to split it) you just do:

 SELECT d.id_developer, 
        d.name,
        GROUP_CONCAT(e.email, ', ') as emails
   FROM developers d
            INNER JOIN dev_email e
               ON d.id_developer = e.id_developer
  GROUP BY d.id_developer, 
           d.name

PS.: I did all of this out of my head, please run it in a test environment first (a copy of your current database to be safe). It should be ok but better safe than sorry right?

Insert one column DISTINCT, with corresponding other columns from one table to another

Answers (1)

Related Questions