gulfy
gulfy

Reputation: 47

Converting extracted text string to date where string varies in length in Postgres

I have a materialized view of a text column that extracts a string of numbers representing a date.

The materialized view is created using the following function:

(regexp_replace(left(substring(lower(replace(content,' ','_')) from 're-inspection_date:_(.*)_'),10),'\D','','g'))

And outputs a text string in the format of MMDDYYYY except it does not account for leading zeroes for single-digit months and days.

When I try to use the "to_date" function specifying the format MMDDYYYY using the following:

(to_date(regexp_replace(left(substring(lower(replace(content,' ','_')) from 're-inspection_date:_(.*)_'),10),'\D','','g'),'MMDDYYYY'))

I get the error "date/time field value out of range: '12122018'".

I believe the issue is due to one or both of the following reasons:

Is there a way to make change these output strings to date format without changing my regexp?

If not, how could I change my regexp for extracting these values?

Bear in mind that the date I'm after in the source text is formatted as 12/1/2018 and also doesn't account for leading 0's in days or months. Also, there is another date preceding the target date in the text formatted the same way.

Here is a sample of the source text:

PLACEHOLDER TEXT FOR REDACTED STUFF BLAH BLAH BLAH **** Loremipsum 11/28/2018 4: 21: 37 PM ****1 of 2 Facility Information Permit Number: 12-34-56789 Name of Facility: Dolor sit amet-consectetur Address: 123 Fake Street City, Zip: adipiscing elit12345 RESULT: sed Do Eiusmod tempor: by 8: 00 AM Re-Inspection Date: 12/4/2018 Type: Blah-Type Stuff Etc: Dolor sit amet-consectetur...

Where the "Re-Inspection Date: 12/4/2018" is what I'm after.

I'm on Postgres 11.

Upvotes: 0

Views: 645

Answers (2)

Humpelstielzchen
Humpelstielzchen

Reputation: 6441

Kaushik Nayak is correct I guess. I get the same thing with this regex using a positive lookbehind (?<= Re-Inspection Date:) and allowing for any number of integers [0-9]* seperated with one slash /{1}

SELECT to_date(substring('string'
            from '(?<=Re-Inspection Date: )[0-9]*/{1}[0-9]*/{1}[0-9]*'), 'mm/dd/yyyy');

Upvotes: 1

Kaushik Nayak
Kaushik Nayak

Reputation: 31676

You may specify varying lengths of integers using the repetition {} pattern

select to_date(substring(lower(content) 
from  're-inspection date:\s*(\d{1,2}/\d{1,2}/\d{4})' ),'mm/dd/yyyy') from t

Demo

Upvotes: 0

Related Questions