Reputation: 5091
I have a MySQL database with 4 tables:
job
job_application
client
candidate
Each table has its own primary key, i.e. job_id, job_application_id, client_id, candidate_id.
Employers in the client table can post jobs in the job table. The job table contains a client_id field which identifies the client.
Candidates in the candidate table can apply for a job, inserting a row into the job_application table. The job_application table contains a job_id field and a candidate_id field to identify what the job is and who applied for it.
I've run into a bit of a problem writing the queries that let Employers manage the job applications they receive. As an example, here is a function I wrote that deletes rows from job_application:
public function deleteJobApplications($job_application_ids) {
    // Cast every ID to int before building the IN list; the join to job restricts the delete to the current client's jobs.
    $ids = implode("','", array_map('intval', $job_application_ids));
    $this->db->query("DELETE ja.* FROM " . DB_PREFIX . "job_application ja"
        . " LEFT JOIN " . DB_PREFIX . "job j ON (j.job_id = ja.job_id)"
        . " WHERE ja.job_application_id IN ('" . $ids . "')"
        . " AND j.client_id = '" . (int)$this->client->getClientId() . "'");
}
Because the client_id is only referenced in the job table, I need to LEFT JOIN the job table every time I want to UPDATE or DELETE from the job_application table.
Should I add another client_id field to the job_application table, essentially duplicating data already held in the database, or continue with the LEFT JOIN for every UPDATE and DELETE?
Upvotes: 0
Views: 401
Reputation: 95532
Your problem isn't that you need to denormalize "job_applications" by introducing "client_id" as a redundant column. (The currently accepted answer is factually incorrect in that regard.) Your problem is that you didn't normalize correctly to begin with. If you had, the column "client_id" would already be in that table, and the problem would never have arisen.
Let's pretend that candidate names, client names, and job names are globally unique.
A table that looks like this will satisfy the predicate: Person named "candidate_name" applies for "job_name" at company "client_name".
job_applications
Person named <candidate_name> applies for <job_name> at company <client_name>.
client_name  job_name               candidate_name
--
Microsoft    C++ programmer, Excel  Ed Wood
Microsoft    C++ programmer, Excel  Dane Crute
Microsoft    C++ programmer, Excel  Vim Winder
Microsoft    C++ programmer, Word   Wil Krug
Microsoft    C++ programmer, Word   Val Stein
Google       Python coder, search   Ed Wood
Google       Programmer, compilers  Ed Wood
Google       Programmer, compilers  Val Stein
Three columns, no id numbers, no nulls, no nonprime attributes, all key. This relation is in 6NF.
It should be obvious that you could create a table for jobs (or job offers) by selecting distinct values from the first two columns. The foreign key reference is obvious.
jobs
Company named <client_name> offers <job_name>.
client_name  job_name
--
Microsoft    C++ programmer, Excel
Microsoft    C++ programmer, Word
Google       Python coder, search
Google       Programmer, compilers
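The projection from "job_applications" to "jobs" can be tried out directly. This is only a minimal sketch: it assumes an in-memory SQLite database driven from Python rather than MySQL (the SELECT DISTINCT itself is plain SQL), it reuses the sample rows and column names from the tables above, and the function name distinct_jobs is made up for illustration.

```python
import sqlite3

# Sample rows copied from the job_applications table above.
ROWS = [
    ("Microsoft", "C++ programmer, Excel", "Ed Wood"),
    ("Microsoft", "C++ programmer, Excel", "Dane Crute"),
    ("Microsoft", "C++ programmer, Excel", "Vim Winder"),
    ("Microsoft", "C++ programmer, Word", "Wil Krug"),
    ("Microsoft", "C++ programmer, Word", "Val Stein"),
    ("Google", "Python coder, search", "Ed Wood"),
    ("Google", "Programmer, compilers", "Ed Wood"),
    ("Google", "Programmer, compilers", "Val Stein"),
]


def distinct_jobs():
    """Load the sample rows and return the distinct (client_name, job_name) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE job_applications (
            client_name    TEXT NOT NULL,
            job_name       TEXT NOT NULL,
            candidate_name TEXT NOT NULL,
            PRIMARY KEY (client_name, job_name, candidate_name)
        )
        """
    )
    conn.executemany("INSERT INTO job_applications VALUES (?, ?, ?)", ROWS)
    # Project onto the first two columns; DISTINCT collapses repeated jobs.
    rows = conn.execute(
        "SELECT DISTINCT client_name, job_name "
        "FROM job_applications ORDER BY client_name, job_name"
    ).fetchall()
    conn.close()
    return rows


jobs = distinct_jobs()
for client_name, job_name in jobs:
    print(client_name, "|", job_name)
```

The eight application rows collapse to the four jobs listed in the "jobs" table above.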
In a similar way, you can select distinct values from the first column alone for a set of companies, and from the last column alone for a set of applicants. Again, the foreign key references should be obvious.
clients
Company named <client_name> is a client.
client_name
--
Microsoft
Google
candidates
Person named <candidate_name> is looking for a job.
candidate_name
--
Ed Wood
Dane Crute
Vim Winder
Wil Krug
Val Stein
All those tables are in 6NF.
Augmenting a table with a surrogate key in addition to its natural keys doesn't change the normal form when you do it correctly. Let's replace the natural keys in "job_applications" with your surrogate ID numbers. Making that replacement will result in your table looking like this. (In practice, you'd do the same thing in the other tables, too.)
job_applications
--
client_id
job_id
candidate_id
primary key (client_id, job_id, candidate_id)
other columns go here...
Note that client_id is already in there. If there are no other columns, you're still in at least 5NF.
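One way to read this design is sketched below. It makes several assumptions that are not in the text above: SQLite through Python stands in for MySQL, "jobs" is keyed on (client_id, job_id) so that "job_applications" can reference it with a composite foreign key, and the ID values and the helper names build_schema and delete_application are invented for illustration. With client_id in the table, a delete scoped to one employer no longer needs a join.

```python
import sqlite3


def build_schema(conn):
    """Create the four tables; the key layout of jobs is an assumption."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE clients (client_id INTEGER PRIMARY KEY);
        CREATE TABLE candidates (candidate_id INTEGER PRIMARY KEY);
        CREATE TABLE jobs (
            client_id INTEGER NOT NULL REFERENCES clients (client_id),
            job_id    INTEGER NOT NULL,
            PRIMARY KEY (client_id, job_id)
        );
        CREATE TABLE job_applications (
            client_id    INTEGER NOT NULL,
            job_id       INTEGER NOT NULL,
            candidate_id INTEGER NOT NULL REFERENCES candidates (candidate_id),
            PRIMARY KEY (client_id, job_id, candidate_id),
            FOREIGN KEY (client_id, job_id) REFERENCES jobs (client_id, job_id)
        );
        """
    )


def delete_application(conn, client_id, job_id, candidate_id):
    """Delete one application, scoped to the given client, without joining jobs."""
    cur = conn.execute(
        "DELETE FROM job_applications "
        "WHERE client_id = ? AND job_id = ? AND candidate_id = ?",
        (client_id, job_id, candidate_id),
    )
    return cur.rowcount


conn = sqlite3.connect(":memory:")
build_schema(conn)
# Hypothetical sample data: two clients, two candidates, one job per client.
conn.executemany("INSERT INTO clients VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO candidates VALUES (?)", [(10,), (11,)])
conn.executemany("INSERT INTO jobs VALUES (?, ?)", [(1, 100), (2, 200)])
conn.executemany(
    "INSERT INTO job_applications VALUES (?, ?, ?)",
    [(1, 100, 10), (1, 100, 11), (2, 200, 10)],
)

# The composite foreign key rejects an application that pairs job 100 with client 2.
try:
    conn.execute("INSERT INTO job_applications VALUES (2, 100, 11)")
    mismatch_rejected = False
except sqlite3.IntegrityError:
    mismatch_rejected = True

wrong_client = delete_application(conn, 2, 100, 10)  # client 2 does not own job 100
right_client = delete_application(conn, 1, 100, 10)
remaining = conn.execute("SELECT COUNT(*) FROM job_applications").fetchone()[0]
```

The composite foreign key is what keeps the client_id stored with an application from disagreeing with the job's actual client, so the column is not unconstrained duplicate data.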
Upvotes: 2
Reputation: 438
Whether adding the column is worth it depends on your case, in particular on the size of the tables. Duplicating data this way is called denormalization; see, for example, http://en.wikipedia.org/wiki/Denormalization
Upvotes: 0