Gaurav S
Gaurav S

Reputation: 1009

REGEXP Substring Oracle

I'm looking for a regexp to get the correct file name from the below string.

Input

Report_18072014.pdf

Expected Output

Report.pdf

The query so far which gives me the correct output.

REPLACE(FILENAME, '_' || TRIM(REGEXP_SUBSTR (FILENAME, '[^_.]+',1,2))) 

My question is if there is a better regexp I can use to avoid concatenating _ to the substring in the replace function.

Upvotes: 1

Views: 360

Answers (2)

Doug
Doug

Reputation: 3863

Try using this [untested]:

REGEXP_REPLACE(FILENAME, '_[[:digit:]]*\.', '.')

I really enjoy using regular expressions, but I wish that Oracle used the PCRE format instead of the POSIX format.

Explanation:

  • _[[:digit:]]* matches the underscore and 0 or more digits.
  • . matches the period to make sure that you are beside the file extension
  • The final , '.' replaces everything with a period so that you have still have the .PDF

I would have changed the replacement to '.PDF', but I don't want to force your file names to mixed case files.

Upvotes: 1

zx81
zx81

Reputation: 41838

With your file format, you can use something like this:

REGEXP_REPLACE(FILENAME, '^([[:alnum:]]+)_[^.]+(\.[^.]+)',
                         '\1\2',
                         1, 0, 'c')

Explanation

  • The ^ anchor asserts that we are at the beginning of the string
  • ([[:alnum:]]+) captures letters to Group 1.
  • [^.]+ matches any chars that are not a .
  • (\.[^.]+) captures the extension to Group 2 (the dot and any chars that are not a dot)
  • \1\2 replaces with Group 1, Group 2

Upvotes: 2

Related Questions