6ton
6ton

Reputation: 4214

Use the values of a column (instead of numeric ids) as foreign key reference

Was not sure how to express this in the title. So here's the deal: I have a table storing information about currency pairs used in foreign exchange rates:

PAIR_ID | BASE_CURRENCY | TERM_CURRENCY | ATTRIBUTE1 | ATTRIBUTE2 ...

Ideally I should have another table to store the currency symbols (master data), say CURRENCY_SYMBOLS and foreign keys from BASE_CURRENCY and TERM_CURRENCY to this table. However I am confused about 2 possible approaches here.

Approach 1:

CURRENCY_PAIRS:

PAIR_ID | BASE_CURRENCY_ID | TERM_CURRENCY_ID | ATTRIBUTE1 | ATTRIBUTE2 ...

CURRENCY_SYMBOLS:

SYMBOL_ID | SYMBOL

with BASE_CURRENCY_ID & TERM_CURRENCY_ID referencing SYMBOL_ID

Or Approach 2: rather than having a symbol_id that really adds no value, just have:

CURRENCY_PAIRS:

PAIR_ID | BASE_CURRENCY | TERM_CURRENCY | ATTRIBUTE1 | ATTRIBUTE2 ...

CURRENCY_SYMBOLS:

SYMBOL

with BASE_CURRENCY & TERM_CURRENCY referencing the SYMBOL directly.

I am not sure which one is better, approach 1 seems ideal but really no advantage - in fact in all my queries an additional join will be needed to retrieve data.

Approach 2 seems more efficient but somehow not correct.

Any pointers on which one I should go with?

Upvotes: 0

Views: 173

Answers (3)

Alvin Thompson
Alvin Thompson

Reputation: 5448

Approach 2 seems like a good idea at first, but there are a few problems with it. I'll list them all even though 1 and 2 don't really apply as much to you, since you're only using it with 3-digit ISO codes:

  1. Foreign key references can take up more room. Depending on how long you need to make your VARCHARs, they can take up more room as foreign keys then, say, a byte or a short. If you have zillions of objects which refer to these foreign keys then it adds up. Some DBs are smart about this and replace the VARCHARs with hash table references in the referring tables, but some don't. No DB is smart about it 100% of the time.
  2. You're necessarily exposing database keys (which should have no meaning, at least to end-users) as business keys. What if the bosses want to replace "USD" with "$" or "Dollars"? You would need to add a lookup table in that case, negating a primary reason to use this approach in the first place. Otherwise you'd need to change the value in the CURRENCY_SYMBOLS, which can be tricky (See #3).
  3. It's hard to maintain. Countries occasionally change. They change currencies as they enter/leave the Euro, have coups, etc. Sometimes just the name of the currency becomes politically incorrect. With this approach you not only would have to change the entry in CURRENCY_SYMBOLS, but cascade that change to every object in the DB that refers to it. That could be incredibly slow. Also, since you have no constant keys, the keys the programmers are hard-wiring into their business logic are these same keys that have now changed. Good luck hunting through the entire code base to find them all.

I often use a "hybrid" approach; that is, I use approach 1 but with a very short VARCHAR as the ID (3 or 4 characters max). That way, each entry can have a "SYMBOL" field which is exposed to end users and can be changed as needed by simply modifying the one table entry. Also, developers have a slightly more meaningful ID than trying to remember that "14" is the Yen and "27" is the US Dollar. Since these keys are not exposed, they don't have to change so long as the developers remember that YEN was the currency before The Great Revolution. If a query is just for business logic, you may still be able to get away with not using a join. It's slower for some things but it's faster for others. YMMV.

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269563

I would recommend using the symbol id, but it is close. This assumes that you really mean the currency abbreviation, rather than the symbol. I generally prefer surrogate numeric keys. If I have to use a string, then I want to avoid international characters.

One issue is dealing with currencies that may not be international standards or that may change over time. In the past 15 years, we have seen many currencies change, primarily to the euro. But you have other instances where, say, the Turkish lira was re-evaluated. So if you used your own definition, you might not distinguish between the two currencies.

Also, depending on your application, you may be calling something a "currency" when it is not an official currency. This happens when financial products are priced using some sort of basket of currencies (or other benchmark metric), but you want to treat "currency-basket bonds" the same way as other bonds in your system.

Because the issue of currencies is more complicated than I had once thought, I would lean to having a surrogate key to give the application more flexibility.

Upvotes: 1

Brian Leeming
Brian Leeming

Reputation: 11720

In both cases you need a join so you are not saving a join.

Option 1 adds an ID. This ID will default to have a clustered index. Meaning the data is sorted on disk with the lowest ID first and the highest ID at the end. This is a flexible option that will allow easy future development.

Option 2 will hard code the symbols into the Currency Pairs table. This means if at a later date you want to add another column to the symbols table, eg for grouping, you will need to create the symbol_id field and update all your records in the currency pairs table. This increases maintenance costs.

I always add int ID fields for this sort of table because the overhead is low and maintenance is easier.

There are also indexing advantages to option 1

Upvotes: 1

Related Questions