Francesco Marchioni

Reputation: 4328

Split string into array using multi-character delimiter

I need to split a string into an array. My problem is that the delimiter is three characters long: _-_

For example:

db2-111_-_oracle12cR1RAC_-_mariadb101

I need to create the following array:

db2-111
oracle12cR1RAC
mariadb101

Similar questions followed this approach:

str="db2-111_-_oracle12cR1RAC_-_mariadb101"
arr=(${str//_-_/ })
echo ${arr[@]}

The array is created, but it has been split incorrectly:

db2 
111 
oracle12cR1RAC 
mariadb101

It seems that the "-" character in the first item causes the array's split function to fail.

Can you suggest a fix for it? Thanks.

Upvotes: 4

Views: 2420

Answers (5)

RARE Kpop Manifesto

Reputation: 2807

echo 'db2-111_-_oracle12cR1RAC_-_mariadb101' |
  mawk NF=NF FS='_[-]_' OFS='\n'    # gawk accepts the same command

db2-111
oracle12cR1RAC
mariadb101

If you like the fringe but ultra-concise RS syntax, it's

mawk ~ RS='_-_|\n'

   or

mawk \$_ RS='_-_|\n'

   or simply

mawk RS RS='_-_|\n'

db2-111
oracle12cR1RAC
mariadb101

Upvotes: 2

anubhava

Reputation: 784898

Here is a solution using replacement of _-_ with a NUL byte since we cannot make a safe assumption that some character like # or ; or : will not be present in input strings.

readarray -d '' arr < <(
   awk -F'_-_' -v OFS='\0' '{ORS=OFS; $1=$1} 1' <<< "$str")

declare -p arr
declare -a arr=([0]="db2-111" [1]="oracle12cR1RAC" [2]="mariadb101")

Note that the -d option of readarray requires Bash 4.4 or later.

Upvotes: 0

stack0114106

Reputation: 8711

Using a Perl one-liner that prints every field, however many there are:

$ echo "db2-111_-_oracle12cR1RAC_-_mariadb101" | perl -F'/_-_/' -lane 'print for @F'
db2-111
oracle12cR1RAC
mariadb101

Upvotes: 0

Aserre

Reputation: 5062

You could use sed to do what you want, i.e. write something like this:

str="db2-111_-_oracle12cR1RAC_-_mariadb101"
arr=($(sed 's/_-_/ /g' <<< "$str"))
echo "${arr[0]}"

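Edit:

Both arr=(${str//_-_/ }) and the sed version rely on word splitting of an unquoted expansion, so with the default IFS (space, tab, newline) either one yields the three expected elements. Output like the one in the question usually means that IFS has been changed to include "-". Note that either form also breaks if an element contains whitespace or glob characters.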
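A quick way to see how IFS affects the unquoted expansion is to run it under two IFS values. This is only a sketch assuming bash; the second IFS value is a guess at what the asker's shell might have had set, not something the question confirms, and the names default_ifs and dash_ifs are made up for the example.

```shell
#!/usr/bin/env bash
str="db2-111_-_oracle12cR1RAC_-_mariadb101"

# Default IFS (space, tab, newline): only the inserted spaces split the string.
IFS=$' \t\n'
default_ifs=(${str//_-_/ })
echo "default IFS: ${#default_ifs[@]} elements"

# Assumed scenario: IFS also contains "-", so "db2-111" is split as well.
IFS=$' \t\n-'
dash_ifs=(${str//_-_/ })
echo "IFS with '-': ${#dash_ifs[@]} elements"

IFS=$' \t\n'   # restore the default
```

The first case gives three elements, the second gives four (db2, 111, oracle12cR1RAC, mariadb101), which matches the output shown in the question.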

Upvotes: 2

chepner

Reputation: 530862

If you can, replace the _-_ sequences with another single character that you can use for field splitting. For example,

$ str="db2-111_-_oracle12cR1RAC_-_mariadb101"
$ str2=${str//_-_/#}
$ IFS="#" read -ra arr <<< "$str2"
$ printf '%s\n' "${arr[@]}"
db2-111
oracle12cR1RAC
mariadb101
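If no single character can be ruled out of the input, the same idea can be done without a substitute character by peeling the string apart with parameter expansion. The following is a minimal sketch, assuming bash 3.1 or later; split_on and parts are hypothetical names chosen for this example, not part of the answer above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split $1 on the literal delimiter $2 into the global array "parts".
split_on() {
  local s=$1 delim=$2
  parts=()
  # Peel off everything before the first delimiter until none is left.
  while [[ $s == *"$delim"* ]]; do
    parts+=("${s%%"$delim"*}")   # text before the first delimiter
    s=${s#*"$delim"}             # drop that text and the delimiter
  done
  parts+=("$s")                  # remainder after the last delimiter
}

split_on "db2-111_-_oracle12cR1RAC_-_mariadb101" "_-_"
printf '%s\n' "${parts[@]}"
```

Because the delimiter is quoted inside the expansions, it is matched literally, and no word splitting is involved, so elements containing spaces or glob characters survive intact.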

Upvotes: 3
