Reputation: 4007

Is there a database design pattern name for reducing duplicate join table data?

I have two tables with a join table to allow a many-to-many relationship.

It's a very familiar design pattern. It indicates which Branches each Member has access to.

As the number of members and branches increases I end up with a lot of data in the join table that is duplicated across members. Members tend to have access to the same groups of Branches as other Members.

So I'm looking at normalizing my data by creating a MemberProfile table that is effectively immutable. And rather than creating MemberBranch records for every Member I check for a matching MemberProfile, use if it already exists, or create one if it doesn't:

The idea being if I have a million Members with only a hundred access profiles this will save me a lot of space in my database.

I'm happy that it all works and that the development effort is worth is.

My question is "Is this a standard database design pattern, and if so, what is it called?"

EDIT: It's been pointed out that this is compressing the data not normalizing it. Which is the intent behind the design.

Upvotes: 3

Answers (3)

Glenner003

Reputation: 1587

Hi I don't know about a pattern name but I've used the same principle before. To keep this performing well, introduce a checksum to memberProfile based upon the branches for the profile, this way a lookup for an existing profile is plain easy and fast.

But do remember that the checksum is not necessarily unique, in case of collisions you will still have to check the branches, but only for the profiles sharing the same checksum.

Cleanup can be a scheduled task is is nothing more then deleting the profiles without users.

Upvotes: 1

mbinette

Reputation: 5094

I understand you'd like to put a label on that precise transformation, but unfortunately, there aren't many books that discuss database design or refactoring patterns. One of the few is Martin Fowler's Refactoring Databases, which you may know for his work on analysis patterns (he also has a great blog, worth following!). In that book, Martin presents a bunch of refactoring patterns that can be applied to databases and has put a name on common database transformations, including the one you have presented, which he called Split Table.

Split Table. Vertically split (e.g. by columns) an existing table into one or more tables.

A catalog of the database refactorings presented in that book are available here.

Upvotes: 2

philipxy

Reputation: 15158

Unless your many:many table is always the join of particular other base tables, one is not normalizing. You aren't normalizing here. Normalization does not introduce new column names. It just rearranges the current ones among different base tables.

You are just compressing/encoding your data. There is not necessarily any benefit in this, since now some queries and updates will be slower although your database is smaller. (You have reported that it is worth it in your case.)

Upvotes: 5

Is there a database design pattern name for reducing duplicate join table data?

Answers (3)

Related Questions