rafek
rafek

Reputation: 5480

Many-to-many relationship structure in SQL Server with or without extra primary key column?

Assume that we have two tables: Roles and Reports. And there exists a many-to-many relationship between them. Of course, the only solution that comes to my mind is to create a cross-table, let's name it RoleReport. I can see two approaches to the structure of that table:

1. Columns: RoleReportId, RoleId, ReportId
   PK: RoleReportId
2. Columns: RoleId, ReportId
   PK: RoleId, ReportId

Is there any real difference between them (performance or whatever else)?

Upvotes: 2

Views: 2695

Answers (7)

Quassnoi
Quassnoi

Reputation: 425813

You will need a composite UNIQUE index on (RoleId, ReportId) anyway.

There is no point in not doing it a PRIMARY KEY.

If you do it a CLUSTERED PRIMARY KEY (which is default), this will be better performance-wise, since it will be less in size.

A clustered primary key will contain only two columns in each record: RoleID and ReportID, while a secondary index will contain three columns: RoleID, ReportID and RoleReportID (as a row pointer).

You may want to create an additional index on ReportID which may be used to search all Roles for a given Report.

There would be some point in making a surrogate key for this relationship if the two following conditions held:

  1. You have additional attributes in your relationship (i. e. this table contains additional columns, like Date or anything else)
    • You have lots of tables that refer to this relationship with a FOREIGN KEY

In this case it would be nicer to have a single-column PRIMARY KEY to refer to in FOREIGN KEY relationships.

Since you don't seem to have this need, just make a composite PRIMARY KEY.

Upvotes: 10

Larry Lustig
Larry Lustig

Reputation: 50998

The benefit of using RoleReportID as a single-column primary key comes when you (or the other guy, depending on the structure of your company) need to write a front end that addresses individual role<->report relationships (for instance, to delete one). At that point, you may prefer the fact that you need to address only one column, instead of two, to identify the linking record.

Other than that, you don't need the RoleReportID column.

Upvotes: 0

Thomas Weller
Thomas Weller

Reputation: 11717

If you have many rows, then it might be beneficial to have appropriately ordered indexes on your RoleId and/or ReportId columns, since this will speed up look up operations - but inversely this will slow down insert/delete operations. This is a classic usage profile issue...

If not required otherwise, omit the RoleReportId PK. It adds nothing to the relationship, forces the Server to generate a useless number on each insert, and leaves the other two columns unordered, which slows down lookups.

But all in all, we are talking about milliseconds here. This only becomes relevant, if there is a huge amount of data (say more than 10.000 rows)...

Upvotes: 1

user186336
user186336

Reputation:

I would suggest du choose no PK for your second choice. You may use indices or an unique constraint over the combination of both columns.

Upvotes: 0

Greg D
Greg D

Reputation: 44096

Semantically, the difference is what you're using as the primary key.

Typically I let the remainder of my schema dictate what I do in this situation. If the cross-table is exclusively the implementation of the many-to-many relationship, I tend to use the concatenated primary key. If I'm hanging more information off the cross table, making it an entity in its own right, I'm more inclined to give it its own id independent of the two tables it's connecting.

This is, of course, subjective. I don't claim that this is the One True Way (tm).

Upvotes: 2

Martin B
Martin B

Reputation: 24180

Unless you really need the RoleReportId as a foreign key in some other table (which is not usually going to be the case), go with option 2. It's going to require less storage, and that by itself will probably give a performance advantage -- plus why have a column you're never going to use?

Upvotes: 2

paxdiablo
paxdiablo

Reputation: 882606

You don't actually need the RoleReportId. It adds nothing to the relationship.

Many people try to avoid using a naturally-unique key in real tables, instead opting for an artificially unique one, but I don't always agree with that. For example, if you can be sure that your SSN will never change, you can use that as a key. If it somehow does change in the future, you can fix it then.

But I don't intend arguing that point, there's good arguments on both sides. However, you certainly don't need an artificially unique key in this case since both your other fields are, and will remain, unique.

Upvotes: 5

Related Questions