Suppose I want to create a simple database which lets user create playlists and add multiple songs into it. I just want to be able to find which songs are added in a particular playlist. song table : `song_id` INT AUTO_INCREMENT PRIMARY KEY, `song_title` VARCHAR playlist table : `playlist_id` INT AUTO_INCREMENT PRIMARY KEY, `playlist_title` VARCHAR What would be the best option to pull this off? Add another column to the playlist table and insert comma separated ids of songs into that column. Which I don't think would be a proper relational way to do it but does the job. or Create separate table just to store song ids with the playlist id to which it belongs. Like playlist_id INT, song_id INT where both columns are foreign keys. Now, if the second option is better, should I add another column as a primary key and auto_increment knowing that it won't be useful anywhere? Because I read some articles online and many of them suggests that not having a primary key of a table significantly affects its performance in a negative way.

Reputation: 1466

When is it okay to not to use PRIMARY KEY?

Suppose I want to create a simple database which lets user create playlists and add multiple songs into it. I just want to be able to find which songs are added in a particular playlist.

song table :

`song_id` INT AUTO_INCREMENT PRIMARY KEY, `song_title` VARCHAR

playlist table :

`playlist_id` INT AUTO_INCREMENT PRIMARY KEY, `playlist_title` VARCHAR

What would be the best option to pull this off?

Add another column to the playlist table and insert comma separated ids of songs into that column. Which I don't think would be a proper relational way to do it but does the job.

Create separate table just to store song ids with the playlist id to which it belongs. Like playlist_id INT, song_id INT where both columns are foreign keys.

Now, if the second option is better, should I add another column as a primary key and auto_increment knowing that it won't be useful anywhere? Because I read some articles online and many of them suggests that not having a primary key of a table significantly affects its performance in a negative way.

Upvotes: 3

Answers (6)

Neville Kuyt

Reputation: 29619

There are two aspects to your question - the abstract, philosophical view and practical implications.

Philosophically, the way we decide whether a database design is "good" is to see if it's normalized. You have two entities in your design - song and playlist. You have two relationships - a song can belong to 0..n play lists, and a play lists contain 0..n songs. You want to store those facts individually, rather than bundling them together. This means that the bridging table is "best" as it stores a single fact (song x belongs to playlist y), independently of the existence of song or playlist. The alternative design stores several facts in a single row - "a playlist exists, and has the following songs".

The second philosophical issue is "how do I uniquely identify facts?". In your bridging table, the unique fact is "song x belongs to playlist y". It can only belong to that playlist once (actually, that's probably not true - you may need a column to indicate in which order the song appears). This means you have a natural, compound key right there in your business domain. Philosophically, this is what you want to use to identify those records, so this should be your primary key.

From a practical point of view, the first question (option one or option two) depends on how your application will work and evolve. If you ever have to answer the question "in which playlists does this song appear", option 2 is much better - option one would require a where clause like 'where playlist.songs like '% songid,&', which will be very slow. If you ever have to delete a song, and make sure all the references are deleted too - option 2 is much better. Option one would be slow to find, and the code to update the comma-separated list would be horrible. If you ever have to insert songs in the middle of the play list, option 2 is much better.

As for the question "how do I assign my primary key" - i think you may have misunderstood the articles. A primary key is a logical concept, and doesn't need to be an auto-incrementing integer. As long as you have good indexes (and indexes are different to primary keys) your performance will be fine.

Upvotes: 2

Chris Travers

Reputation: 26454

In InnoDB keep in mind that you access rows by traversing the primary key index in logical order, so you need to ask how you are looking up rows. Traversing an index is O(log(N)) complexity but if you are using a secondary index you are doing that twice.

Usually having a single column pkey in InnoDB is better, but there may be exceptions.

Upvotes: 1

Sanooj T

Reputation: 1327

playlist_table

`playlist_id` INT AUTO_INCREMENT PRIMARY KEY, `playlist_title` VARCHAR

songs_table

`song_id` INT AUTO_INCREMENT PRIMARY KEY, `song_title` VARCHAR,playlist_id INT FOREIGN KEY (playlist_id) REFERENCES playlist_table(playlist_id)

When you want a search use join to find songs

select * from songs_table left join playlist_table on(songs_table.playlist_id=playlist_table.playlist_id)

Upvotes: -1

JamieP

Reputation: 25

I would say option two will be the most beneficial to you. You would then have a table such as the following:

playlist_items table

pi_id INT AUTO_INCREMENT PRIMARY KEY

pi_song_id INT

pi_playlist_id INT

With this you you could then add functionality in the future if required such as:

pi_dateadded DATETIME

Upvotes: 1

Kickstart

Reputation: 21513

The 2nd option is FAR preferable.

As to the extra primary key, while not necessary I tend to use one even if just to make it easier to process rows from that table.

For example, say you want to delete a dozen rows you can use IN (comma separated list of ids) rather than a lot of where clauses checking each pair of fields in the rows.

As an aside, there are many reasons the 2nd option is preferable:-

What happens when you want more items in the comma separated list than will fit in the field?
What happens when you want to search for a value in that list? You cannot index for a value half way through that list.
What happens when you want to hang another value from the item on the play list? For example the number of times that track has been played on that playlist?
Etc

Upvotes: 1

Tim Biegeleisen

Reputation: 520948

You should strongly lean towards option two, namely creating a table which relates a playlist ID to a song ID. In this case, you can actually create a primary key which is a composite of the playlist and song ID.

CREATE TABLE playlist_songs (
    song_id INT,
    playlist_id INT,
    PRIMARY KEY (song_id, playlist_id)
)

As to whether you also need an auto increment column on playlist_songs, it would depend on your situation. You might not need it from a business logic point of view since you would likely be manipulating the table using the two columns already there.

Upvotes: 5

When is it okay to not to use PRIMARY KEY?

Answers (6)

Related Questions