Reputation: 517
I don't understand how to get 'nth match of pattern starting at end of string'. I've read, but cannot apply.
The column I'm working with is directory path names that look typically like:
I:\044\LOGFILE\aw_34\
I want to pull out the last directory name, and the 2nd to last. Using regexp_substr I am able to get the last directory using
SELECT REGEXP_SUBSTR(col_name, '\\[^\]+\\$')
I think what I'm asking here is: 'beginning at the end of the value in col_name, return the first instance of 1 or more non-backslash characters that lie between two backslashes'.
However, I'm not able to use any sensible combination of option parameters to get just the second folder name ('\LOGFILE\' in this example). I've tried:
SELECT REGEXP_SUBSTR(col_name, '\\[^\]+\\$', 1, 2)
returns NULL. It seems I'm not actually asking 'start at the end of the string and find the second occurrence of the pattern'. So, I've resorted to matching the pattern of the two last folders:
SELECT REGEXP_SUBSTR(col_name, '\\[^\]+\\[^\]+\\$')
then, wrapping this expression in a 2nd regex to get a match on a single folder beginning at the front.
That works, but doesn't help me understand the basic error in using
REGEXP_SUBSTR(col_name, '\\[^\]+\\$', 1, 2)
or another more 'direct' way of pulling out just the match I need ('\LOGFILE\' in this example). What's wrong?
Upvotes: 1
Views: 284
Reputation: 46365
Your attempted REGEXP_SUBSTR(col_name, '\\[^\]+\\$', 1, 2)
doesn't work because there are not two matches which end with the $
end-of-string - there can be at most one such match (any other match isn't at the end of the string, by definition).
I would try something like
REGEXP_SUBSTR(col_name, '(\\[^\]+){2}\\$')
then extract the first part of that... Note this is slightly different than what you had.
Alternatively, in later versions of Oracle (from version 11g onwards) there are ways to use capturing groups - a sixth argument to REGEX_SUBSTR
. See for example https://stackoverflow.com/a/7759146/1967396 which leads to
REGEXP_SUBSTR(col_name, '(\\[^\]+){2}\\$', 1, 1, NULL, 1)
to give the "contents of the first capturing group" - which is "the thing in parentheses in my regex" - i.e. \LOGIFLE
in your example (without the trailing \
though... since that belongs to the "next match").
Upvotes: 2
Reputation: 37520
To match the second to last folder, use a capture group...
SELECT REGEXP_SUBSTR(col_name, '(\\[^\]+\\)[^\]+\\$', 1, 1, NULL, 1)
The last parameter says get the first capture group, which is the thing in the parentheses in the regex pattern.
And REGEXP_SUBSTR(col_name, '\\[^]+\\$', 1, 2)
didn't work because $
matches end of the string so there won't be a second match.
Upvotes: 3