Reputation: 21269
I am working on genealogical software that stores its data in SQLite3 format. Everything works fine, except for one minor detail. Not in all cases is the accuracy of the birth or death dates (etc) available to the exact day. So I have the following accuracies:
Now, assuming I store everything in a single column, I end up with a problem. Since SQLite3 has the Julian Day function I was thinking to encode the accuracy in the fractional part of the REAL
Julian Day (I don't need the hours anyway). That is fine, but it complicates the way SELECT
s work, in fact it means that stuff I could otherwise offload to SQLite3 has to be implemented in application code.
What would be a reasonable method to store the inaccurate dates and be able to query them quickly?
Note: if it matters to anyone answering, the language used is Python, but I am asking in general.
Upvotes: 2
Views: 183
Reputation: 180020
When doing queries on those date values, the most common operation probably is to check whether a date might match another date.
For this, you always need the start and the end of the interval, so it would make sense to store these two values in the DB.
(Call them Start
/End
or Min
/Max
or Earliest
/Latest
or whatever makes sense.)
For example, to find people who might have been born one century ago:
... WHERE '1913-04-16' BETWEEN BirthDateMin AND BirthDateMax
Inequality comparisons can be done with one of the interval boundaries. For example, to find people who might have been born more than one century ago:
... WHERE BirthDateMin < '1913-04-16'
Upvotes: 1
Reputation: 41170
Use two columns. One column is the approximate date, as accurate as possible, in SQLite format. The second column is the accuracy of the date in days. If the date is absolutely accurate, the second column is zero. If only the month is known, the date would be mid month and the second column 15 days. Etc. Date comparisons can be done by comparing against the date +/- the accuracy column.
Upvotes: 0
Reputation: 39480
Just because you're storing date information, doesn't mean that the built-in date type is the right one for you. Your data requirements (date inaccuracy) means that it's probably more accurate and better long-term to do some custom date-handling work, and avoid using the built-in date data types.
Upvotes: 0