Reputation: 1466
Suppose I want to create a simple database which lets user create playlists and add multiple songs into it. I just want to be able to find which songs are added in a particular playlist.
song table :
`song_id` INT AUTO_INCREMENT PRIMARY KEY, `song_title` VARCHAR
playlist table :
`playlist_id` INT AUTO_INCREMENT PRIMARY KEY, `playlist_title` VARCHAR
What would be the best option to pull this off?
playlist
table and insert comma separated ids of songs into that column. Which I don't think would be a proper relational way to do it but does the job.or
playlist_id INT, song_id INT
where both columns are foreign keys. Now, if the second option is better, should I add another column as a primary key and auto_increment knowing that it won't be useful anywhere? Because I read some articles online and many of them suggests that not having a primary key of a table significantly affects its performance in a negative way.
Upvotes: 3
Views: 568
Reputation: 29619
There are two aspects to your question - the abstract, philosophical view and practical implications.
Philosophically, the way we decide whether a database design is "good" is to see if it's normalized. You have two entities in your design - song and playlist. You have two relationships - a song can belong to 0..n play lists, and a play lists contain 0..n songs. You want to store those facts individually, rather than bundling them together. This means that the bridging table is "best" as it stores a single fact (song x belongs to playlist y), independently of the existence of song or playlist. The alternative design stores several facts in a single row - "a playlist exists, and has the following songs".
The second philosophical issue is "how do I uniquely identify facts?". In your bridging table, the unique fact is "song x belongs to playlist y". It can only belong to that playlist once (actually, that's probably not true - you may need a column to indicate in which order the song appears). This means you have a natural, compound key right there in your business domain. Philosophically, this is what you want to use to identify those records, so this should be your primary key.
From a practical point of view, the first question (option one or option two) depends on how your application will work and evolve. If you ever have to answer the question "in which playlists does this song appear", option 2 is much better - option one would require a where clause like 'where playlist.songs like '% songid,&', which will be very slow. If you ever have to delete a song, and make sure all the references are deleted too - option 2 is much better. Option one would be slow to find, and the code to update the comma-separated list would be horrible. If you ever have to insert songs in the middle of the play list, option 2 is much better.
As for the question "how do I assign my primary key" - i think you may have misunderstood the articles. A primary key is a logical concept, and doesn't need to be an auto-incrementing integer. As long as you have good indexes (and indexes are different to primary keys) your performance will be fine.
Upvotes: 2
Reputation: 26454
In InnoDB keep in mind that you access rows by traversing the primary key index in logical order, so you need to ask how you are looking up rows. Traversing an index is O(log(N)) complexity but if you are using a secondary index you are doing that twice.
Usually having a single column pkey in InnoDB is better, but there may be exceptions.
Upvotes: 1
Reputation: 1327
playlist_table
`playlist_id` INT AUTO_INCREMENT PRIMARY KEY, `playlist_title` VARCHAR
songs_table
`song_id` INT AUTO_INCREMENT PRIMARY KEY, `song_title` VARCHAR,playlist_id INT FOREIGN KEY (playlist_id) REFERENCES playlist_table(playlist_id)
When you want a search use join to find songs
select * from songs_table left join playlist_table on(songs_table.playlist_id=playlist_table.playlist_id)
Upvotes: -1
Reputation: 25
I would say option two will be the most beneficial to you. You would then have a table such as the following:
playlist_items table
pi_id
INT AUTO_INCREMENT PRIMARY KEY
pi_song_id
INT
pi_playlist_id
INT
With this you you could then add functionality in the future if required such as:
pi_dateadded
DATETIME
Upvotes: 1
Reputation: 21513
The 2nd option is FAR preferable.
As to the extra primary key, while not necessary I tend to use one even if just to make it easier to process rows from that table.
For example, say you want to delete a dozen rows you can use IN (comma separated list of ids) rather than a lot of where clauses checking each pair of fields in the rows.
As an aside, there are many reasons the 2nd option is preferable:-
Upvotes: 1
Reputation: 520948
You should strongly lean towards option two, namely creating a table which relates a playlist ID to a song ID. In this case, you can actually create a primary key which is a composite of the playlist and song ID.
CREATE TABLE playlist_songs (
song_id INT,
playlist_id INT,
PRIMARY KEY (song_id, playlist_id)
)
As to whether you also need an auto increment column on playlist_songs
, it would depend on your situation. You might not need it from a business logic point of view since you would likely be manipulating the table using the two columns already there.
Upvotes: 5