Reputation: 1851
I have some data on fatalities which I'm trying to store, and I'm trying to come up with a reasonable scheme for storing the age of the person when they died.
I don't have DoB data for any of them, but I do have date of death generally (although not always very precisely) and I have data of varying accuracy for their age at death.
Some typical source data might be:
between 20 and 29 years old (or "in their 20s")
5 years old
2 months old
40 days old
adult
child
elderly
I have typically been storing this in three fields...
age_min (integer years)
age_max (integer years)
age_category (enum - baby, child, adult, elderly)
...but clearly this doesn't capture the 2 months old or 40 days old very well, both of which would simply end up as 0 years in my current schema, which is needlessly throwing away information.
It is very important that the database is honest about the precision to which information is known. So converting 2 months into 60 days, for example, would be a bad thing, because it implies a level of precision the source data didn't provide - converting it into 60-90 days might be ok.
I also considered adding a units field so I'd have...
age_min (integer)
age_min_unit (enum - days, months, years)
but the problem with this is it makes comparisons annoying. 24 months == 2 years, but dealing with that just makes a lot of code much more complex than I suspect it needs to be.
I could store all ages in days, with a min and a max, but then the complexity becomes converting that back into something human readable which isn't clunky and doesn't express a greater degree of precision than I actually have.
So for example, 40 days might end up being rendered as "1 month, 10 days", which is actually a little less precise than saying 40 days.
Upvotes: 1
Views: 483
Reputation: 471
Been there, done that. The least ambiguous and easiest-to-process approach is to convert everything to days and add a +/- tolerance. That way everything can be stored in 2 fields and all situations are covered. Obviously you have to convert to a human-readable format before display.
If you have date of birth and date of death the tolerance becomes 0.
Thus the following input values will yield the indicated stored values.
5 years: 2007 183 (i.e. 5.5 x 365 = 2007 days, rounded down. 365/2 = +/-183 days, rounded up.)
2 months: 75 15
9 years 7 months: 3512 15
child: First value is the midpoint of your preferred "child" age range in days. (1-12?, 3-18?). Tolerance is half the width of that range.
baby: Same again. Decide on what constitutes a "baby" (0-2?) and generate the values accordingly.
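As a rough sketch of the conversion described above, something like the following could work in Python. The 30-day month, the 365-day year, the function names and the 1-12 "child" range are all assumptions for illustration, not part of the scheme itself. An age stated as N whole units is treated as covering N up to (but not including) N+1 units.

```python
# Assumed approximations: a month is 30 days, a year is 365 days.
DAYS_PER_UNIT = {"days": 1, "months": 30, "years": 365}


def to_midpoint_tolerance(value, unit):
    """Convert an age of `value` whole units to (midpoint_days, tolerance_days).

    "5 years" is read as "somewhere between 5 and 6 years", so the midpoint
    is 5.5 units (rounded down) and the tolerance is half a unit (rounded up).
    """
    span = DAYS_PER_UNIT[unit]
    midpoint = int((value + 0.5) * span)
    tolerance = (span + 1) // 2
    return midpoint, tolerance


def range_to_midpoint_tolerance(min_years, max_years):
    """Convert a range in years (e.g. a hypothetical "child" = 1-12)
    to (midpoint_days, tolerance_days), where the tolerance is half the
    width of the range."""
    low = min_years * DAYS_PER_UNIT["years"]
    high = max_years * DAYS_PER_UNIT["years"]
    return (low + high) // 2, (high - low) // 2


print(to_midpoint_tolerance(5, "years"))   # (2007, 183)
print(to_midpoint_tolerance(2, "months"))  # (75, 15)
print(range_to_midpoint_tolerance(1, 12))  # (2372, 2007)
```

The first two results match the stored values listed above. Recovering the original range is then just midpoint minus tolerance to midpoint plus tolerance.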
Upvotes: 1
Reputation: 7590
Store the value as min+max+unit. 'adult','child'... etc can be represented as a unit of age for which the min and max would be ignored.
Then you need to find the answer to philosophical questions like "Who is older: a child or a person between 5 and 12 years old?".
When you have the answer to those for all of the possible combos of age types, you will be able to tell if it's possible to use a canonical representation of the age (e.g. days) for comparing.
If it's possible - you can add an additional field with the age in days (or seconds, or something...) to use for comparing/sorting. The compare field can be calculated with a trigger, or in the app.
If it's not possible - you will need a custom comparator for sorting. AFAIK that can't be done in MySQL, so you will probably have to do all sorting and comparing in the app.
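For the "it's possible" case, a minimal sketch of how the compare field could be calculated in the app might look like this (Python). The category ranges, the 30-day month and the function name are purely hypothetical choices. Picking those ranges is exactly the philosophical question above, and this just makes one possible answer explicit.

```python
# Assumed approximations: a month is 30 days, a year is 365 days.
DAYS_PER_UNIT = {"days": 1, "months": 30, "years": 365}

# Hypothetical ranges (in years) for the category "units". These are
# illustrative guesses, not values taken from any real schema.
CATEGORY_RANGES_YEARS = {"child": (1, 12), "adult": (18, 65), "elderly": (65, 100)}


def compare_key(age_min, age_max, unit):
    """Return a canonical age in days, usable for sorting and comparing.

    For category units, age_min and age_max are ignored. For numeric units,
    "20 to 29 years" is read as covering 20 up to (not including) 30 years.
    The key is the midpoint of the resulting range in days.
    """
    if unit in CATEGORY_RANGES_YEARS:
        low_years, high_years = CATEGORY_RANGES_YEARS[unit]
        low = low_years * DAYS_PER_UNIT["years"]
        high = high_years * DAYS_PER_UNIT["years"]
    else:
        low = age_min * DAYS_PER_UNIT[unit]
        high = (age_max + 1) * DAYS_PER_UNIT[unit]
    return (low + high) / 2


records = [
    (None, None, "adult"),
    (5, 12, "years"),
    (None, None, "child"),
    (40, 40, "days"),
    (2, 2, "months"),
]
ordered = sorted(records, key=lambda record: compare_key(*record))
```

With these assumed ranges, a "child" sorts before a person between 5 and 12 years old, because the midpoint of 1-12 is lower. Different ranges would give a different answer, so the ranges should be treated as part of the data model rather than a detail of the sort.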
Upvotes: 0
Reputation: 1803
OK, just adding this as an answer for future reference.
Can you try storing age_min and age_max in days, and also carrying one more field, "human_readable_age_text", which reads, say, "40 days"?
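A quick sketch of what that could look like in Python. The class name, the parser, the accepted input pattern and the 30/365-day approximations are assumptions for illustration only. The original wording is kept verbatim for display, so nothing has to be reconstructed from the day counts.

```python
import re
from dataclasses import dataclass

# Assumed approximations: a month is 30 days, a year is 365 days.
DAYS_PER_UNIT = {"day": 1, "month": 30, "year": 365}


@dataclass
class AgeRecord:
    age_min_days: int
    age_max_days: int  # inclusive upper bound
    human_readable_age_text: str


def from_source_text(text):
    """Parse simple inputs such as "40 days old" or "2 months old".

    An age of N units is stored as the range of days from N units up to
    the last day before N + 1 units. The source wording is kept as-is.
    """
    cleaned = text.strip()
    match = re.fullmatch(r"(\d+) (day|month|year)s? old", cleaned)
    if match is None:
        raise ValueError(f"unsupported age description: {text!r}")
    value = int(match.group(1))
    span = DAYS_PER_UNIT[match.group(2)]
    return AgeRecord(value * span, (value + 1) * span - 1, cleaned)


record = from_source_text("2 months old")
print(record.age_min_days, record.age_max_days, record.human_readable_age_text)
```

Comparisons and sorting would use the two day fields, while the text field is what gets shown to people, so "40 days old" stays "40 days old" instead of turning into "1 month, 10 days".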
Upvotes: 1