INSTR/SUBTR to filter string in Oracle SQL

I have a string that appears as:

00012345678 Rain, Kip

I would like to filter out the first numbers/integers, then re-arrange the first and last name.

Kip Rain

I was thinking that I could do INSTR({string},',','1') to get to the first comma, but I am unsure how to do both numbers and punctuation in one line. Would I have to chain the INSTR?

Thanks for your help!

Upvotes: 0

Answers (3)

Ben

Reputation: 52853

You can chain them; but with complicated things this quickly becomes confusing to work out what's happening. Unless you have demonstrable performance concerns it's often quicker to use regular expressions. In this case, it's probably easiest to use REGEXP_REPLACE()

select regexp_replace(your_string
                , '[^[:alpha:]]+([[:alpha:]]+)[^[:alpha:]]+([[:alpha:]]+)'
                , '\2 \1')
  from ...

The second parameter is the match string; in this case we're searching for everything that is not an alphabetic character ([^[:alpha:]]) 1 or more times (+), followed by alphabetic characters ([[:alpha:]]) 1 or more times. This is repeated to take into account the spaces and comma; and would match your string as follows:

|string        | matched by     |
+--------------+----------------+
|'00012345678 '| [^[:alpha:]]+  |
|'Rain'        | ([[:alpha:]]+) |
|', '          | [^[:alpha:]]+  |
|'Kip'         | ([[:alpha:]]+) |

The parenthesis here represent groups; the first set the first group etc...

The third parameter of REGEXP_REPLACE() tells Oracle what to replace your string with; this where the groups come in - you can replace groups in any order. In this instance I want the second group (Kip), followed by a space, followed by the first group (Rain).

You can see this all demonstrated in this SQL Fiddle

Upvotes: 1

Patrick Bacon

Reputation: 4640

I am biased towards using the regular expression variation of the substr function.

First obtain a repeating list of non-numeric characters as follows:

REGEXP_SUBSTR('00012345678 Rain, Kip','([[:alpha:]]|[-])+',1,1)

where [[:alpha:]] is a character class where all alphabetic characters are included. The bracketed expression, [-], is just a matching list which is my way of identifying that the last name, Rain, could include a hyphen. The alternation operator, '|', states that either the alphabetic or hyphen characters are acceptable. The '+' indicates that we are looking to match one or more occurrences.

Second, obtain the last non-numeric characters at the end of the string:

REGEXP_SUBSTR('00012345678 Rain, Kip','[^, ]+$',1,1)

Here, I am going to the end of the string (using the anchor, '$'), and find all character after the comma and space.

Next I combine (with a space in between) using the concatenator operator, ||.

REGEXP_SUBSTR('00012345678 Rain, Kip','[^, ]+$',1,1) ||' ' || REGEXP_SUBSTR('00012345678 Rain, Kip','([[:alpha:]]|[ -])+',1,1)

Upvotes: 0

vav

Reputation: 4684

Yes, it is alright to chain them:

substr(str, 1, instr(str, ' ')) number_part
substr(str, instr(str, ' '), instr(str, ',') - instr(str, ' ')) Kip
substr(str, instr(str, ' ', 2), len(str)) Rain

In last example you may use something more preceise than len(str) if your string is longer.

Upvotes: 0

INSTR/SUBTR to filter string in Oracle SQL

Answers (3)

Related Questions