Kim
Kim

Reputation: 1010

Regular Expression - Get specific group value in SQL

I have the following regular expression problem.

Input:

123_321_009
3111_00_001
5123_123
555
666_A66
777_B77_777

Output request as below:

123_321
3111_00
5123
555
666_A66
777_B77

Is there any way to get the value of the output above?

I tried below statement but lack the idea how to get the value i needed.

^(.*?)\\s?([_0-9])?$

Value appearing after the last underscore are not needed.

Upvotes: 0

Views: 59

Answers (2)

Noel
Noel

Reputation: 10525

You can use REGEXP_REPLACE to remove the numbers following/including the last underscore.

SQL Fiddle

Query:

with x(y) as (
  select '123_321_009' from dual union all
  select '3111_00_001' from dual union all
  select '5123_123' from dual union all
  select '666_A66' from dual union all
  select '777_B77_777' from dual union all
  select '555' from dual
)
select y, regexp_replace(y,'_\d+$') substr
from x

Results:

|           Y |  SUBSTR |
|-------------|---------|
| 123_321_009 | 123_321 |
| 3111_00_001 | 3111_00 |
|    5123_123 |    5123 |
|     666_A66 | 666_A66 |
| 777_B77_777 | 777_B77 |
|         555 |     555 |

Pattern:

_       --matches an underscore
\d+     --matches one or more numbers
$       --matches end of the string

Effectively, this matches all the digits following/including the last underscore. Third parameter is regexp_replace is omitted. So, the pattern is removed and replaced by nothing.

Upvotes: 2

Auric Mason
Auric Mason

Reputation: 26

Since you a using Oracle, you can use a regex_replace(val,pattern,'') function with this pattern.

The following patterns would satisfy the samples you provided:

_[0-9]{3}$

_[0-9]*$

Here is a demonstration of this approach using SQL*Plus:

 SCOTT@dev> WITH tab(num_val) AS
      2    ( SELECT '123_321_009' FROM dual
      3    UNION ALL
      4    SELECT '3111_00_001' FROM dual
      5    UNION ALL
      6    SELECT '5123_123' FROM dual
      7    UNION ALL
      8    SELECT '555' FROM dual
      9    )
     10  SELECT tab.num_val,
     11    regexp_replace(tab.num_val,'_[0-9]{3}$') approach_1,
     12    regexp_replace(tab.num_val,'_[0-9]*$') approach_2
     13  FROM tab
     14  /

    NUM_VAL     APPROACH_1             APPROACH_2
    =========== ====================== ==================
    123_321_009 123_321                123_321
    3111_00_001 3111_00                3111_00
    5123_123    5123                   5123
    555         555                    555

If you provided a larger sampling (or a more specific rule), a more specific solution could be provided.

Upvotes: 1

Related Questions