not-a-bug
not-a-bug

Reputation: 463

Postgres to_date() function returns wrong year

I used a query: select to_date(substring('0303653597' from 0 for 7),'DDMMYY')

Expected output: 1965-03-03
Actual output : 2065-03-03

When I give the string as 030370 or above it behaves correctly.
Whats wrong with the predefined function?
Can we use any other function to achieve the same?

Upvotes: 1

Views: 625

Answers (3)

Jim Jones
Jim Jones

Reputation: 19653

I believe you won't get a better answer than the one from @whites11, but you can try to fix it by parsing the strings and adding the century yourself:

WITH j (dt) AS (
  VALUES ('0303653597'),('0303701111'),('0510511111'),('0510051111')
)
SELECT
  CASE 
   WHEN substring(dt from 5 for 2)::int > '49' THEN
     to_date(substring(dt from 1 for 4) || '19' || substring(dt from 5 for 2), 'DDMMYYYY')
  ELSE 
     to_date(substring(dt from 1 for 4) || '20' || substring(dt from 5 for 2), 'DDMMYYYY')
  END
FROM j;

  to_date   
------------
 1965-03-03
 1970-03-03
 1951-10-05
 2005-10-05

Play with the case condition and see if it fits your needs.

Demo: db<>fiddle

Upvotes: 1

Ftisiot
Ftisiot

Reputation: 1868

A solution for this problem heavy relies on the fact that you can guarantee your dates are starting after a certain year for which the two numbers representation doesn't overlap with the years passed in 2000 century so far.

E.g. if the earliest datapoint is in 1930 you'll know that anything with YY less than 30 needs to be considered 2000+ and anything >30 needs to be considered 1900-1999. If you have entries also for years like 1919 the above problem is not solvable because any date with YY=[00-21] can't be uniquely associated.

However, if you can state that your dates can't be bigger than today's date then a possible solution is to check if the extracted date is bigger than today, if so, add a '19` prefix to the year, like in the example below

with dt as (
  select '0303653597' dt_str
  union all
  select '1705213597' dt_str
  union all
  select '1805213597' dt_str
)
select
to_date(
substring(dt_str from 0 for 5) || 
case when
  to_date(substring(dt_str from 0 for 7),'DDMMYY' ) <= current_date
  then '20' 
  else '19' 
end
|| substring(dt_str from 5 for 2)
,'DDMMYY')
from dt;

For the 3 dates (03/03/65, 17/05/21 and 18/05/21) as of today (17/05/21) the output will correctly be

  to_date   
------------
 1965-03-03
 2021-05-17
 1921-05-18
(3 rows)

Upvotes: 0

whites11
whites11

Reputation: 13300

According to the documentation:

In to_timestamp and to_date, if the year format specification is less than four digits, e.g., YYY, and the supplied year is less than four digits, the year will be adjusted to be nearest to the year 2020, e.g., 95 becomes 1995.

in your case 2065 is closer to 2020 than 1965 and thus it defaults to 2065.

Upvotes: 3

Related Questions