user2276280
user2276280

Reputation: 601

Remove substrings that vary in value in Oracle

I have a column in Oracle which can contain up to 5 separate values, each separated by a '|'. Any of the values can be present or missing. Here are come examples of how the data might look:

100-1
10-3|25-1|120/240
15-1|15-3|15-2|120/208
15-1|15-3|15-2|120/208|STA-2
112-123|120/208|STA-3

The values are arbitrary except for the order. The numerical values separated by dashes always come first. There can be 1 to 3 of these values present. The numerical values separated by a slash (if it is present) is next. The string, 'STA', and a numerical value separated by a dash is always last, if it is present.

What I would like to do is reformat this column to only ever include the first three possible values, those being the three numerical values separated by dashes. Afterwards, I want to replace 2nd numeric in each value (the numeric after the dash) using the following pattern:

1 = A
2 = B
3 = C

I would also like to remove the dash afterwards, but not the '|' that separates the values unless there is a trailing '|'.

To give you an idea, here's how the values at the beginning of the post would look after the reformatting:

100A
10C|25A
15A|15C|15B
15A|15C|15B
112ABC

I'm thinking this can be done with regex expressions but it's got me a little confused. Does anyone have a solution?

Upvotes: 0

Views: 41

Answers (2)

Adam Silenko
Adam Silenko

Reputation: 3108

you can do this in one line, but you can write simple function to do that

SELECT str, REGEXP_REPLACE(str,'(\|\d+\/\d+)?(\|STA-\d+)?','') cut
, REGEXP_REPLACE(REGEXP_REPLACE(str,'(\|\d+\/\d+)?(\|STA-\d+)?',''), '(\-)([1,2]*)(3)([1,2]*)', '\1\2C\4') rep3toC
, REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(str,'(\|\d+\/\d+)?(\|STA-\d+)?',''), '(\-)([1,2]*)(3)([1,2]*)', '\1\2C\4'), '(\-)([1,C]*)(2)([1,C]*)', '\1\2B\4') rep2toB
, REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(str,'(\|\d+\/\d+)?(\|STA-\d+)?',''), '(\-)([1,2]*)(3)([1,2]*)', '\1\2C\4'), '(\-)([1,C]*)(2)([1,C]*)', '\1\2B\4'), '(\-)([B,C]*)(1)([B,C]*)', '\1\2A\4') rep1toA
, REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(REGEXP_REPLACE(str,'(\|\d+\/\d+)?(\|STA-\d+)?',''), '(\-)([1,2]*)(3)([1,2]*)', '\1\2C\4'), '(\-)([1,C]*)(2)([1,C]*)', '\1\2B\4'), '(\-)([B,C]*)(1)([B,C]*)', '\1\2A\4'), '-', '') "rep-"
FROM (
SELECT '100-1' str FROM dual UNION 
SELECT '10-3|25-1|120/240' str FROM dual UNION 
SELECT '15-1|15-3|15-2|120/208' str FROM dual UNION 
SELECT '15-1|15-3|15-2|120/208|STA-2' str FROM dual UNION 
SELECT '112-123|120/208|STA-3' FROM dual     
) tab

Upvotes: 0

user2705585
user2705585

Reputation:

If I have to solve this problem I will solve it in following ways.

SELECT
  REGEXP_REPLACE(column,'\|\d+\/\d+(\|STA-\d+)?',''),
  REGEXP_REPLACE(column,'(\d+)-(1)([^\d])','\1A\3'),
  REGEXP_REPLACE(column,'(\d+)-(2)([^\d])','\1B\3'),
  REGEXP_REPLACE(column,'(\d+)-(3)([^\d])','\1C\3'),
  REGEXP_REPLACE(column,'(\d+)-(123)([^\d])','\1ABC')
FROM table;

Explanation: Let us break down each REGEXP_REPLACE statement one by one.

  1. REGEXP_REPLACE(column,'\|\d+\/\d+(\|STA-\d+)?','')
    This will replace the end part like 120/208|STA-2 with empty string so that further processing is easy.

  2. Finding match was easy but replacing A for 1, B for 2 and C for 3 was not possible ( as per my knowledge ) So I did those matching and replacements separately.

    In each regex from second statement (\d+)-(yourNumber)([^\d]) first group is number before - then yourNumber is either 1,2,3 or 123 followed by |.

    So the replacement will be according to yourNumber.

All demos here from version 1 to 5.

Note:- I have just done replacement for combination of yourNUmber for those present in question. You can do likewise for other combinations too.

Upvotes: 1

Related Questions