Reputation: 47
I have a materialized view of a text column that extracts a string of numbers representing a date.
The materialized view is created using the following function:
(regexp_replace(left(substring(lower(replace(content,' ','_')) from 're-inspection_date:_(.*)_'),10),'\D','','g'))
And outputs a text string in the format of MMDDYYYY except it does not account for leading zeroes for single-digit months and days.
When I try to use the "to_date" function specifying the format MMDDYYYY using the following:
(to_date(regexp_replace(left(substring(lower(replace(content,' ','_')) from 're-inspection_date:_(.*)_'),10),'\D','','g'),'MMDDYYYY'))
I get the error "date/time field value out of range: '12122018'".
I believe the issue is due to one or both of the following reasons:
The resulting strings from my current regexp in the materialized view vary in length (e.g. 12212018 8222018 962018) due to my regexp removing all non-integer characters. The dates are 6, 7 or 8 digits long.
As a result, I haven't yet been able to come up with a way of inserting a delimiter between the month/day/year values.
Is there a way to make change these output strings to date format without changing my regexp?
If not, how could I change my regexp for extracting these values?
Bear in mind that the date I'm after in the source text is formatted as 12/1/2018 and also doesn't account for leading 0's in days or months. Also, there is another date preceding the target date in the text formatted the same way.
Here is a sample of the source text:
PLACEHOLDER TEXT FOR REDACTED STUFF BLAH BLAH BLAH **** Loremipsum 11/28/2018 4: 21: 37 PM ****1 of 2 Facility Information Permit Number: 12-34-56789 Name of Facility: Dolor sit amet-consectetur Address: 123 Fake Street City, Zip: adipiscing elit12345 RESULT: sed Do Eiusmod tempor: by 8: 00 AM Re-Inspection Date: 12/4/2018 Type: Blah-Type Stuff Etc: Dolor sit amet-consectetur...
Where the "Re-Inspection Date: 12/4/2018" is what I'm after.
I'm on Postgres 11.
Upvotes: 0
Views: 645
Reputation: 6441
Kaushik Nayak is correct I guess. I get the same thing with this regex using a positive lookbehind (?<= Re-Inspection Date:) and allowing for any number of integers [0-9]* seperated with one slash /{1}
SELECT to_date(substring('string'
from '(?<=Re-Inspection Date: )[0-9]*/{1}[0-9]*/{1}[0-9]*'), 'mm/dd/yyyy');
Upvotes: 1
Reputation: 31676
You may specify varying lengths of integers using the repetition {}
pattern
select to_date(substring(lower(content)
from 're-inspection date:\s*(\d{1,2}/\d{1,2}/\d{4})' ),'mm/dd/yyyy') from t
Upvotes: 0