Reputation: 821
I have a column in which each string starts like this: 'Chicago, IL, April 20, 2015 — and so on text here'. I want to extract the date part from this string in Oracle. Any ideas on how to do this? I was able to find something for mm/dd/yyyy, like the query below, but nothing for the long date format.
SELECT REGEXP_SUBSTR(' the meeting will be on 8/8/2008', '[0-9]{1,}/[0-9]{1,}/[0-9]{2,}') FROM dual
Upvotes: 0
Views: 12804
Reputation: 5442
If your column's value always follows the pattern 'Chicago, IL, April 20, 2015 — and so on text here', then you could simply use SUBSTR instead of REGEXP_SUBSTR:
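SELECT
  TRIM(  -- strip the spaces left on either side of the date
    SUBSTR(column_name
          ,INSTR(column_name, ',', 1, 2) + 1
          ,INSTR(column_name, '—') - INSTR(column_name, ',', 1, 2) - 1
          )
  )
FROM
  your_table;  -- placeholder: the table that holds column_name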
If not, then you could use REGEXP_SUBSTR as the other answers mention. My original answer was wrong, as @MTO commented.
Upvotes: 1
Reputation: 167981
You could use:
SELECT TO_DATE(
         REGEXP_SUBSTR(
           'Chicago, IL, April 20, 2015 — and so on text here',
           '(JANUARY|FEBRUARY|MARCH|APRIL|MAY|JUNE|JULY|AUGUST|SEPTEMBER|'
             || 'OCTOBER|NOVEMBER|DECEMBER)'
             || '[[:space:]]+([012]?[0-9]|3[01])'
             || '[[:punct:][:space:]]+\d{4}',
           1,
           1,
           'i'
         ),
         'MONTH DD YYYY'
       )
FROM DUAL;
If you want to validate the dates as well (so you don't get an error for February 29, 2001) then you could use a user-defined function:
CREATE FUNCTION parse_Date(
  in_string     VARCHAR2,
  in_format     VARCHAR2 DEFAULT 'YYYY-MM-DD',
  in_nls_params VARCHAR2 DEFAULT NULL
) RETURN DATE DETERMINISTIC
AS
BEGIN
  RETURN TO_DATE( in_string, in_format, in_nls_params );
EXCEPTION
  -- Any conversion failure (for example an impossible date) yields NULL.
  WHEN OTHERS THEN
    RETURN NULL;
END;
/
And replace the TO_DATE( ... ) function with PARSE_DATE( ... ).
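As a rough sketch of that substitution (assuming the parse_date function above has already been created; the sample string with the invalid date is made up for illustration), the query would look something like this:

```sql
-- Same pattern as above, but PARSE_DATE swallows the conversion error,
-- so an impossible date such as February 29, 2001 should yield NULL
-- instead of raising an exception.
SELECT parse_date(
         REGEXP_SUBSTR(
           'Chicago, IL, February 29, 2001 - and so on text here',  -- made-up sample
           '(JANUARY|FEBRUARY|MARCH|APRIL|MAY|JUNE|JULY|AUGUST|SEPTEMBER|'
             || 'OCTOBER|NOVEMBER|DECEMBER)'
             || '[[:space:]]+([012]?[0-9]|3[01])'
             || '[[:punct:][:space:]]+\d{4}',
           1,
           1,
           'i'
         ),
         'MONTH DD YYYY'   -- passed explicitly; the function's default is 'YYYY-MM-DD'
       ) AS parsed_date
FROM   DUAL;
```

Note that the format model has to be passed explicitly here, since the function's default format does not match the long date style.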
Upvotes: 2
Reputation: 3126
Well, you can take a direct approach and use a regular expression like in the example that you've found:
SELECT
  REGEXP_SUBSTR(
    'Chicago, IL, April 20, 2015 - etc etc',
    '(January|February|March|April|May|June|July|August|September|October|November|December) [0-9]{1,2}, [0-9]{4}'
  )
FROM dual;
But this will only work properly if all the dates are in exactly the same format: full month name with the first letter uppercased, space, day, comma, space, 4-digit year. If there can be more than one space or no space at all, use \s* instead of the spaces in the regular expression. If the month name isn't necessarily in initcap, use initcap() on the source or pass the case-insensitive flag to the regexp_substr function.
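A minimal sketch of that more tolerant variant, combining \s* with the 'i' match parameter (the irregularly spaced, lowercase sample string is made up for illustration):

```sql
-- \s* allows zero or more whitespace characters between the parts,
-- and the 'i' match parameter makes the month names case-insensitive.
SELECT
  REGEXP_SUBSTR(
    'Chicago, IL, april  20,2015 - etc etc',   -- made-up sample
    '(January|February|March|April|May|June|July|August|September|October|November|December)'
      || '\s*[0-9]{1,2},\s*[0-9]{4}',
    1,     -- start searching at the first character
    1,     -- return the first occurrence
    'i'    -- case-insensitive matching
  ) AS date_text
FROM dual;
```

The matched text is returned exactly as it appears in the source, so irregular spacing and casing carry over into the result.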
Additionally, this will catch bogus dates that fit the format, like "April 99, 1234", so you'll have to filter them out afterwards.
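One possible way to do that filtering, sketched under the assumption that you are on Oracle 12.2 or later (where TO_DATE accepts a DEFAULT ... ON CONVERSION ERROR clause), is to convert the extracted text and treat a NULL result as a bogus date. The inline sample rows are made up for illustration:

```sql
-- Assumes Oracle 12.2+. Strings that fit the pattern but are not real
-- dates convert to NULL instead of raising an error.
SELECT date_text,
       TO_DATE(date_text DEFAULT NULL ON CONVERSION ERROR,
               'Month DD, YYYY',
               'NLS_DATE_LANGUAGE = English') AS parsed_date
FROM  (SELECT 'April 20, 2015' AS date_text FROM dual
       UNION ALL
       SELECT 'April 99, 1234' FROM dual);
```

Rows where parsed_date is NULL can then be excluded with an outer query. On older versions, a wrapper function that catches the conversion exception would be the usual substitute.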
Upvotes: 1