Allan Bowe
Allan Bowe

Reputation: 12691

Best practice for SCD date pairs (closing / opening timestamps)

When implementing temporal datetime pairs, which is the better approach?

Approach 1

_RK   FROM_DTTM              TO_DTTM 
1     01-JAN-2012 00:00:00   31-DEC-2012 23:59:59 ------------> Old record closed out 
1     01-JAN-2013 00:00:00   31-DEC-4949 23:59:59 ------------> New active record

Used with a filter clause as follows:

 where FROM_DTTM <= &FILTER_DATE <= TO_DTTM

Approach 2

_RK   FROM_DTTM              TO_DTTM 
1     01-JAN-2012 00:00:00   01-JAN-2013 00:00:00 ------------> Old record closed out 
1     01-JAN-2013 00:00:00   31-DEC-4949 23:59:59 ------------> New active record

Used with a filter clause as follows:

 where FROM_DTTM <= &FILTER_DATE < TO_DTTM

Considerations:

Approach 1 is compatible with BETWEEN clause syntax (inclusive range). For this reason I think that it is the best approach, however - and this is my concern - there will always exist a gap between the closing and opening timestamps! Eg the following will (incorrectly from a business perspective) return nothing:

 where FROM_DTTM <= '31-DEC-2012 23:59:59.1'dt <= TO_DTTM

Different databases will have differing levels of precision here. I guess there would have to be standard (always round to the nearest second when filtering) but just wondered if there were any other reasons to support Approach 1 (such as proposed bi-temporal 'AS OF' syntax compatibility)? Or indeed, whether (and why) Approach 2 would be preferable..

Upvotes: 5

Views: 3357

Answers (1)

Damien_The_Unbeliever
Damien_The_Unbeliever

Reputation: 239636

For continua (such as datetime, or numeric measures), I'd almost always recommend the semi-open interval approach (Approach 2). As you say, approach 1 tends to produce gaps or you need to know the exact precision to compute the end point. Calculating the end points in approach 2 always tends to be simpler.

The "benefit" of being able to use BETWEEN is, so far as I'm concerned, a very minor one compared with being able to quickly assure oneself of the correctness of the query.

For non-continuous data (dates without times, or discrete numerics) I might revert to using a closed interval.

Upvotes: 3

Related Questions