ACobbs
ACobbs

Reputation: 364

Database normalization: Using separate tables to store a single field

Currently our database is set up so that a payment transactions records a payment type ID, and this links to a payment type (cash, check, credit) table that contains these values. Example:

Payment Transaction:

  • ID
  • Amount
  • Date
  • Payment Type ID

Payment Type:

  • ID
  • Payment Type (Cash, Credit)

My question is whether or not I should just remove the payment type table, and just store the payment type value as text inside the payment transaction.

This is similar to this question. except with payment types it's pretty certain that no new information will ever need to be add data per payment type. 'Cash' doesn't link to anything, there's nothing I need to know about Cash itself, it just is.

As far as I can tell the pros and cons would of replacing the payment type table with a single field would be:

Pros

Cons

I'm leaning towards removing the payment type table and adding the single field to the payment transaction table, what do you recommend would be the best course of action?

Upvotes: 0

Views: 1432

Answers (2)

Jonathan Leffler
Jonathan Leffler

Reputation: 754860

If you remove the PaymentType table, you replace a foreign key check with a table CHECK constraint:

PaymentType   CHAR(6) NOT NULL CHECK(PaymentType IN('Cash', 'Credit', 'Cheque')

OK — you write 'cheque' as 'check'; just another difference between English and American.

Now, this makes it much more difficult to find out what the possible values are; you have to analyze the system catalog to find out. With the separate table, you can examine the separate table to find out what's allowed. Suppose you start tracking 'Debit' separately from 'Credit'; you add a row to a table, vs a table schema change. Suppose you decide you need to record which codes are allowed in future transactions (so 'Cash' ceases to be an option). You can add a column to the Payment Type table to indicate that this code is no longer valid; it is much harder to do that with a simple CHECK constraint.

So, even though you currently have limited or no extra data in the Payment Type table, I would use the Payment Type table rather than embedding the payment type in the Payment Transaction table.

Were it my design, though, I'd probably use a CHAR(1) or CHAR(2) code as the identifier for the payment type, rather than a numeric column. Granted, all three types start with 'C', so maybe you'd use 'A' for cAsh, 'H' for cHeck, and 'R' for cRedit (and maybe 'D' or 'E' for Debit or dEbit) with a CHAR(1) code; with CHAR(2), you'd use 'CA', 'CH', 'CR' (and maybe 'DE'). The full name can be stored in the payment type table for use in reports. In this case, the benefits are not huge, but saving 4 bytes per record on enough records (a large enough number of small enough records) and it can become a factor in your storage costs. Of course, index overhead also comes into play; if the column in the Payment Transaction table must be indexed, then the smaller field uses less index space.

Upvotes: 1

duffymo
duffymo

Reputation: 308998

I don't agree with either of your pro arguments.

Removes a mostly unnecessary join whenever the payment type needs to be found.

There's just your assumption that this will be a performance bottleneck. Denormalization is something you should do when you have data that says you must. This isn't one of those times.

The payment type for a transaction will always accurately reflect what it was at the time the transaction was recorded. i.e. If I change the 'Cash' record in the payment types table to 'Credit' (for whatever reason), all payment transactions that link to Cash will now be linked to Credit.

You should not allow someone to modify the payment type this way. Changing the payment type should be another transaction, with its own timestamp.

Any relational database can handle the JOIN and the normalized tables. You're guilty of premature optimization, I fear.

I'd spend less time worrying about this and more time thinking about how you'll deal with history. How long will you keep transactions around before moving them out to a history table? Have you thought about partitioning your database by months according to timestamp? That would be more worthy of your efforts.

Upvotes: 2

Related Questions