SQL ManyToMany Design Pattern

Question

Suppose I have a MSSQL table for files and another for datetypes. Each file will have one created and modified date. A file may have 0 or more other dates of 15 different datetypes. A file can only have 1 date per datetype. Most files will only have 2 or 3 of these additional datetypes. All the dates must be stored in DATETIMEOFFSET types and the Files PK must be BIGINT type. The datetypes table's PK is TINYINT type. The last constraint is that I cant use uniqueidentifier columns (long story). Sorting speed and Storage space are both big considerations.

I am trying to figure out the best design pattern.

At first I only had 1 table. each row have columns for each kind of date. Except for created and modified, I used sparse columns for the other dates.
Next I had a table called dates that had a PK, date and datetypeID columns. I had another table called files_dates that had a column for FileID and another column for dateid. the primary key included both columns.
When I upserted a new file, i did not like have to use a join to check to see if an existing file had a certain date type.
Then I thought about moving the datetypeID column to the files_dates table. this made the queries easier but it smelled like bad design.
The last thing I thought of was to have only 2 tables: files and dates. the date table would have 3 columns: FileID, datetypeid and the date. the primary key for the dates table would be a combination of the FIleID and datetypeID columns. To my mind, this seems to best use of space because I didn't have to have a BIGINT column for dateid.
I could do a hybrid solution, using #5 and have created and modified dates as columns in the file table.
something else I haven't thought of.

I have not supported my constraints in this post but if you can trust me on those it would be good. I'm dealing with quintillions of files rows.

SQL ManyToMany Design Pattern

Answers (0)

Related Questions