stefanB
stefanB

Reputation: 79930

How do I split a string on a delimiter in Bash?

I have this string stored in a variable:

IN="[email protected];[email protected]"

Now I would like to split the strings by ; delimiter so that I have:

ADDR1="[email protected]"
ADDR2="[email protected]"

I don't necessarily need the ADDR1 and ADDR2 variables. If they are elements of an array that's even better.


After suggestions from the answers below, I ended up with the following which is what I was after:

#!/usr/bin/env bash

IN="[email protected];[email protected]"

mails=$(echo $IN | tr ";" "\n")

for addr in $mails
do
    echo "> [$addr]"
done

Output:

> [[email protected]]
> [[email protected]]

There was a solution involving setting Internal_field_separator (IFS) to ;. I am not sure what happened with that answer, how do you reset IFS back to default?

RE: IFS solution, I tried this and it works, I keep the old IFS and then restore it:

IN="[email protected];[email protected]"

OIFS=$IFS
IFS=';'
mails2=$IN
for x in $mails2
do
    echo "> [$x]"
done

IFS=$OIFS

BTW, when I tried

mails2=($IN)

I only got the first string when printing it in loop, without brackets around $IN it works.

Upvotes: 3000

Views: 3765687

Answers (30)

F. Hauri  - Give Up GitHub
F. Hauri - Give Up GitHub

Reputation: 71027

Compatible answer

There are a lot of different ways to do this in .

However, it's important to first note that bash has many special features (so-called bashisms) that won't work in any other .

In particular, arrays, associative arrays, and pattern substitution, which are used in the solutions in this post as well as others in the thread, are bashisms and may not work under other shells that many people use.

For instance: on my Debian GNU/Linux, there is a standard shell called ; I know many people who like to use another shell called ; and there is also a special tool called with his own shell interpreter ().

For compatible answer, go to last part of this answer!

Requested string

The string to be split in the above question is:

IN="[email protected];[email protected]"

I will use a modified version of this string to ensure that my solution will correctly parse strings containing whitespace, which could break other solutions:

IN="[email protected];[email protected];Full Name <[email protected]>"

Split string based on delimiter in (version >=4.2)

In pure bash, we can create an array with elements split by a temporary value for IFS (the input field separator). The IFS, among other things, tells bash which character(s) it should treat as a delimiter between elements when defining an array:

IN="[email protected];[email protected];Full Name <[email protected]>"

# save original IFS value so we can restore it later
oIFS="$IFS"
IFS=";"
declare -a fields=($IN)
IFS="$oIFS"
unset oIFS

In newer versions of bash, prefixing a command with an IFS definition changes the IFS for that command only and resets it to the previous value immediately afterwards. This means we can do the above in just one line:

IFS=\; read -ra fields <<<"$IN"
# after this command, the IFS resets back to its previous value (here, the default):
set | grep ^IFS=
# IFS=$' \t\n'

We can see that the string IN has been stored into an array named fields, split on the semicolons:

set | grep ^fields=\\\|^IN=
# fields=([0]="[email protected]" [1]="[email protected]" [2]="Full Name <[email protected]>")
# IN='[email protected];[email protected];Full Name <[email protected]>'

(We can also display the contents of these variables using declare -p:)

declare -p IN fields
# declare -- IN="[email protected];[email protected];Full Name <[email protected]>"
# declare -a fields=([0]="[email protected]" [1]="[email protected]" [2]="Full Name <[email protected]>")

Note that read is the quickest way to do the split because there are no forks or external resources called.

Once the array is defined, you can use a simple loop to process each field (or, rather, each element in the array you've now defined):

# `"${fields[@]}"` expands to return every element of `fields` array as a separate argument
for x in "${fields[@]}" ;do
    echo "> [$x]"
    done
# > [[email protected]]
# > [[email protected]]
# > [Full Name <[email protected]>]

Or you could drop each field from the array after processing using a shifting approach, which I like:

while [ "$fields" ] ;do
    echo "> [$fields]"
    # slice the array 
    fields=("${fields[@]:1}")
    done
# > [[email protected]]
# > [[email protected]]
# > [Full Name <[email protected]>]

And if you just want a simple printout of the array, you don't even need to loop over it:

printf "> [%s]\n" "${fields[@]}"
# > [[email protected]]
# > [[email protected]]
# > [Full Name <[email protected]>]

Update: recent >= 4.4

In newer versions of bash, you can also play with the command mapfile:

mapfile -td \; fields < <(printf "%s\0" "$IN")

This syntax preserve special chars, newlines and empty fields!

If you don't want to include empty fields, you could do the following:

mapfile -td \; fields <<<"$IN"
fields[-1]=${fields[-1]%$'\n'}  # drop '\n' added on last field, by '<<<'

With mapfile, you can also skip declaring an array and implicitly "loop" over the delimited elements, calling a function on each:

myPubliMail() {
    printf "Seq: %6d: Sending mail to '%s'..." $1 "$2"
    # mail -s "This is not a spam..." "$2" </path/to/body
    printf "\e[3D, done.\n"
}

mapfile < <(printf "%s\0" "$IN") -td \; -c 1 -C myPubliMail

(Note: the \0 at end of the format string is useless if you don't care about empty fields at end of the string or they're not present.)

mapfile < <(echo -n "$IN") -td \; -c 1 -C myPubliMail

# Seq:      0: Sending mail to '[email protected]', done.
# Seq:      1: Sending mail to '[email protected]', done.
# Seq:      2: Sending mail to 'Full Name <[email protected]>', done.

But you could even use newline (mapfile's default) separator:

mapfile <<<"${IN//;/$'\n'}" -tc 1 -C myPubliMail

# Seq:      0: Sending mail to '[email protected]', done.
# Seq:      1: Sending mail to '[email protected]', done.
# Seq:      2: Sending mail to 'Full Name <[email protected]>', done.

Or you could use <<<, and in the function body include some processing to drop the newline it adds:

myPubliMail() {
    local seq=$1 dest="${2%$'\n'}"
    printf "Seq: %6d: Sending mail to '%s'..." $seq "$dest"
    # mail -s "This is not a spam..." "$dest" </path/to/body
    printf "\e[3D, done.\n"
}

mapfile <<<"$IN" -td \; -c 1 -C myPubliMail

# Renders the same output:
# Seq:      0: Sending mail to '[email protected]', done.
# Seq:      1: Sending mail to '[email protected]', done.
# Seq:      2: Sending mail to 'Full Name <[email protected]>', done.

Split string based on delimiter in

If you can't use bash, or if you want to write something that can be used in many different shells, you often can't use bashisms -- and this includes the arrays we've been using in the solutions above.

However, we don't need to use arrays to loop over "elements" of a string. There is a syntax used in many shells for deleting substrings of a string from the first or last occurrence of a pattern. Note that * is a wildcard that stands for zero or more characters:

(The lack of this approach in any solution posted so far is the main reason I'm writing this answer ;)

${var#*SubStr}  # drops substring from start of string up to first occurrence of `SubStr`
${var##*SubStr} # drops substring from start of string up to last occurrence of `SubStr`
${var%SubStr*}  # drops substring from last occurrence of `SubStr` to end of string
${var%%SubStr*} # drops substring from first occurrence of `SubStr` to end of string

As explained by Score_Under:

# and % delete the shortest possible matching substring from the start and end of the string respectively, and

## and %% delete the longest possible matching substring.

Using the above syntax, we can create an approach where we extract substring "elements" from the string by deleting the substrings up to or after the delimiter.

The codeblock below works well in (including Mac OS's bash), , , , , , and 's :

(Thanks to Adam Katz's comment, making this loop a lot simplier!)

IN="[email protected];[email protected];Full Name <[email protected]>"
while [ "$IN" != "$iter" ] ;do
    # extract the substring from start of string up to delimiter.
    iter=${IN%%;*}
    # delete this first "element" AND his separator, from $IN.
    IN="${IN#$iter;}"
    # Print (or doing anything with) the first "element".
    printf '> [%s]\n' "$iter"
done
# > [[email protected]]
# > [[email protected]]
# > [Full Name <[email protected]>]

Why not cut?

cut is useful for extracting columns in big files, but doing forks repetitively (var=$(echo ... | cut ...)) become quickly overkill!

Here is a correct syntax, tested under many using cut, as suggested by This other answer from DougW:

IN="[email protected];[email protected];Full Name <[email protected]>"
i=1
while iter=$(echo "$IN"|cut -d\; -f$i) ; [ -n "$iter" ] ;do
    printf '> [%s]\n' "$iter"
    i=$((i+1))
done

Comparing execution time.

splitByCut() {
    local i=1
    while iter=$(echo "$1"|cut -d\; -f$i) ; [ -n "$iter" ] ;do
        printf '> [%s]\n' "$iter"
        i=$((i+1))
    done
}

splitByMapFile() {
    iterMF() {
        local seq=$1 dest="${2%$'\n'}"
        [[ $2 ]] && printf "> [%s]\n" "$dest"
    }
    mapfile <<<"${1//;/$'\n'}" -tc 1 -C iterMF
}

Preparing 999 fields:

IN="[email protected];[email protected];Full Name <[email protected]>"
printf -v in40 %333s
in40=${in40// /$IN;}
in40=${in40%;}

Then

start=${EPOCHREALTIME/.};splitByMapFile "$in40" |
    md5sum;elap=00000$((${EPOCHREALTIME/.}-start))
printf 'Elapsed: %.4f secs.\n' ${elap::-6}.${elap: -6}
e35655f2a7fa367144a31f72f55e4dc0  -
Elapsed: 0.0454 secs.

start=${EPOCHREALTIME/.};splitByCut "$in40" |
    md5sum;elap=00000$((${EPOCHREALTIME/.}-start))
printf 'Elapsed: %.4f secs.\n' ${elap::-6}.${elap: -6}
e35655f2a7fa367144a31f72f55e4dc0  -
Elapsed: 2.2212 secs.

Where overall execution time is something like 49x longer, using 1 forks to cut, by field, for less than 1'000 fields!!

Upvotes: 377

DougW
DougW

Reputation: 30055

I've seen a couple of answers referencing the cut command, but they've all been deleted. It's a little odd that nobody has elaborated on that, because I think it's one of the more useful commands for doing this type of thing, especially for parsing delimited log files.

In the case of splitting this specific example into a bash script array, tr is probably more efficient, but cut can be used, and is more effective if you want to pull specific fields from the middle.

Example:

$ echo "[email protected];[email protected]" | cut -d ";" -f 1
[email protected]
$ echo "[email protected];[email protected]" | cut -d ";" -f 2
[email protected]

You can obviously put that into a loop, and iterate the -f parameter to pull each field independently.

This gets more useful when you have a delimited log file with rows like this:

2015-04-27|12345|some action|an attribute|meta data

cut is very handy to be able to cat this file and select a particular field for further processing.

Upvotes: 595

Sagar Deshmukh
Sagar Deshmukh

Reputation: 414

IN="[email protected];[email protected]"

ADDR1=$(echo $IN | tr ";" " " | awk '{print $1}')

ADDR2=$(echo $IN | tr ";" " " | awk '{print $2}')

Upvotes: 1

Alexander Stohr
Alexander Stohr

Reputation: 336

$ a=1234,5678,9101112
$ echo ${a%,*}
1234,5678
$ echo ${a%%,*}
1234
$ echo ${a#*,}
5678,9101112
$ echo ${a##*,}
9101112

Upvotes: 3

FelipeC
FelipeC

Reputation: 9508

Here's a simple POSIX-compatible answer nobody has mentioned:

sIFS=$IFS
IFS=';'
for e in $IN; do
    echo "[$e]"
done
IFS=$sIFS

If you want save it to an array to process later, here's a trick I learned from the git mailing list:

# store in $@
sIFS=$IFS
IFS=';'
set -- $IN
IFS=$sIFS

# traverse $@
for e; do
    echo "[$e]"
done

If all you want is to do some formatting, then:

sIFS=$IFS
IFS=';'
printf '[%s]\n' $IN
IFS=$sIFS

Simple, efficient, and portable.

Upvotes: 0

eukras
eukras

Reputation: 899

There are some cool answers here (errator esp.), but for something analogous to split in other languages -- which is what I took the original question to mean -- I settled on this:

IN="[email protected];[email protected]"
declare -a a="(${IN//;/ })";

Now ${a[0]}, ${a[1]}, etc, are as you would expect. Use ${#a[*]} for number of terms. Or to iterate, of course:

for i in ${a[*]}; do echo $i; done

IMPORTANT NOTE:

This works in cases where there are no spaces to worry about, which solved my problem, but may not solve yours. Go with the $IFS solution(s) in that case.

Upvotes: 10

tjb
tjb

Reputation: 11728

If you just want the first 200...

bash-4.2# for i in {1..200}; do echo $XXX | cut -d":" -f $i; done

Instead of "echo $XXX" you can also write a command like "echo $PATH" or something.

*since this is the 31st answer there should be some justification: it is unique in that it uses range and cut

Upvotes: 0

Jay Sullivan
Jay Sullivan

Reputation: 18299

Simple answer:

IN="[email protected];[email protected]"

IFS=';' read ADDR1 ADDR2 <<< "${IN}"

Sample output:

echo "${ADDR1}"  # prints "[email protected]"
echo "${ADDR2}"  # prints "[email protected]"

Upvotes: 5

sxddhxrthx
sxddhxrthx

Reputation: 792

So many answers and so many complexities. Try out a simpler solution:

echo "string1, string2" | tr , "\n"

tr (read, translate) replaces the first argument with the second argument in the input.

So tr , "\n" replace the comma with new line character in the input and it becomes:

string1
string2

Upvotes: 16

Eduardo Lucio
Eduardo Lucio

Reputation: 2487

Here's my answer!

DELIMITER_VAL='='

read -d '' F_ABOUT_DISTRO_R <<"EOF"
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.4 LTS"
NAME="Ubuntu"
VERSION="14.04.4 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.4 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
EOF

SPLIT_NOW=$(awk -F$DELIMITER_VAL '{for(i=1;i<=NF;i++){printf "%s\n", $i}}' <<<"${F_ABOUT_DISTRO_R}")
while read -r line; do
   SPLIT+=("$line")
done <<< "$SPLIT_NOW"
for i in "${SPLIT[@]}"; do
    echo "$i"
done

Why this approach is "the best" for me?

Because of two reasons:

  1. You do not need to escape the delimiter;
  2. You will not have problem with blank spaces. The value will be properly separated in the array.

Upvotes: 2

user14389165
user14389165

Reputation:

ADDR1=${IN%%;*}
ADDR2=${IN##*;}

Upvotes: 0

Chris Lutz
Chris Lutz

Reputation: 75469

If you don't mind processing them immediately, I like to do this:

for i in $(echo $IN | tr ";" "\n")
do
  # process
done

You could use this kind of loop to initialize an array, but there's probably an easier way to do it.

Upvotes: 400

Christoph
Christoph

Reputation: 762

Edit: I'm sorry, I had read somewhere on SO that perl is required by POSIX, so I thought it would be legitimate to use it. But on unix.stackexchange.com, several users state perl is NOT part of the POSIX specification.

My solution: A function that uses perl's split to do the work.

With detailed comments:

#!/bin/bash

# This function is a wrapper for Perl's split.\
# \
# Since we cannot return an array like in Perl,
# it takes the name of the resulting array as last
# argument.\
# \
# See https://perldoc.perl.org/functions/split for usage info
# and examples.\
# \
# If you provide a Perl regexp that contains e. g. an escaped token like \b,
# space(s) and/or capture group(s), it must be quoted, and e. g. /\b/ must
# be single-quoted.\
# Thus, it's best to generally single-quote a Perl regexp.
function split # Args: <Element separator regexp> <string> <array name>
{
    (($# != 3)) && echo "${FUNCNAME[0]}: Wrong number of arguments, returning." && return 1

    local elementSepRE=$1
    local string=$2
    local -n array=$3

    local element i=0

    # Attention! read does Word Splitting on each line!
    # I must admit I didn't know that so far.
    # This removes leading and trailing spaces, exactly
    # what we don't want.
    # Thus, we set IFS locally to newline only.
    local IFS=$'\n'

    while read element; do
        # As opposed to array+=($element),
        # this preserves leading and trailing spaces.
        array[i++]=$element
    done <<<$(_perl_split)
}

# This function calls Perl's split function and prints the elements of the
# resulting array on separate lines.\
# It uses the caller's $elementSepRE and $string.
function _perl_split
{
    # A heredoc is a great way of embedding a Perl script.
    # N.B.: - Shell variables get expanded.
    #         - Thus:
    #           - They must be quoted.
    #           - Perl scalar variables must be escaped.
    #       - The backslash of \n must be escaped to protect it.
    #       - Instead of redirecting a single heredoc to perl, we may
    #         use multiple heredocs with cat within a command group and
    #         pipe the result to perl.
    #         This enables us to conditionally add certain lines of code.

    {
        cat <<-END
            my \$elementSepRE=q($elementSepRE);
        END

        # If $elementSepRE is a literal Perl regexp, qr must be applied
        # to it in order to use it.
        # N.B.: We cannot write this condition in Perl because when perl
        # compiles the script, all statements are checked for validity,
        # no matter if they will actually be executed or not.
        # And if $elementSepRE was e. g. == ', the line below – although
        # not to be executed – would give an error because of an unterminated
        # single-quoted string.
        [[ $elementSepRE =~ ^m?/ && $elementSepRE =~ /[msixpodualn]*$ ]] && cat <<-END
            \$elementSepRE=qr$elementSepRE;
        END

        cat <<-END
            my @array=split(\$elementSepRE, q($string));

            print(\$_ . "\\n") for (@array);
        END
    } | perl
}

And the same without comments for those who see at a glance what's going on ;)

#!/bin/bash

# This function is a wrapper for Perl's split.\
# \
# Since we cannot return an array like in Perl,
# it takes the name of the resulting array as last
# argument.\
# \
# See https://perldoc.perl.org/functions/split for usage info
# and examples.\
# \
# If you provide a Perl regexp that contains e. g. an escaped token like \b,
# space(s) and/or capture group(s), it must be quoted, and e. g. /\b/ must
# be single-quoted.\
# Thus, it's best to generally single-quote a Perl regexp.
function split # Args: <Element separator regexp> <string> <array name>
{
    (($# != 3)) && echo "${FUNCNAME[0]}: Wrong number of arguments, returning." && return 1

    local elementSepRE=$1
    local string=$2
    local -n array=$3

    local element i=0

    local IFS=$'\n'

    while read element; do
        array[i++]=$element
    done <<<$(_perl_split)
}

function _perl_split
{
    {
        cat <<-END
            my \$elementSepRE=q($elementSepRE);
        END

        [[ $elementSepRE =~ ^m?/ && $elementSepRE =~ /[msixpodualn]*$ ]] && cat <<-END
            \$elementSepRE=qr$elementSepRE;
        END

        cat <<-END
            my @array=split(\$elementSepRE, q($string));

            print(\$_ . "\\n") for (@array);
        END
    } | perl
}

Upvotes: -1

Johannes Schaub - litb
Johannes Schaub - litb

Reputation: 507373

You can set the internal field separator (IFS) variable, and then let it parse into an array. When this happens in a command, then the assignment to IFS only takes place to that single command's environment (to read ). It then parses the input according to the IFS variable value into an array, which we can then iterate over.

This example will parse one line of items separated by ;, pushing it into an array:

IFS=';' read -ra ADDR <<< "$IN"
for i in "${ADDR[@]}"; do
  # process "$i"
done

This other example is for processing the whole content of $IN, each time one line of input separated by ;:

while IFS=';' read -ra ADDR; do
  for i in "${ADDR[@]}"; do
    # process "$i"
  done
done <<< "$IN"

Upvotes: 1719

palindrom
palindrom

Reputation: 19111

Taken from Bash shell script split array:

IN="[email protected];[email protected]"
arrIN=(${IN//;/ })
echo ${arrIN[1]}                  # Output: [email protected]

Explanation:

This construction replaces all occurrences of ';' (the initial // means global replace) in the string IN with ' ' (a single space), then interprets the space-delimited string as an array (that's what the surrounding parentheses do).

The syntax used inside of the curly braces to replace each ';' character with a ' ' character is called Parameter Expansion.

There are some common gotchas:

  1. If the original string has spaces, you will need to use IFS:
  • IFS=':'; arrIN=($IN); unset IFS;
  1. If the original string has spaces and the delimiter is a new line, you can set IFS with:
  • IFS=$'\n'; arrIN=($IN); unset IFS;

Upvotes: 1645

Tong
Tong

Reputation: 2135

I think AWK is the best and efficient command to resolve your problem. AWK is included by default in almost every Linux distribution.

echo "[email protected];[email protected]" | awk -F';' '{print $1,$2}'

will give

[email protected] [email protected]

Of course your can store each email address by redefining the awk print field.

Upvotes: 160

Fil
Fil

Reputation: 25

Yet another late answer... If you are java minded, here is the bashj (https://sourceforge.net/projects/bashj/) solution:

#!/usr/bin/bashj

#!java

private static String[] cuts;
private static int cnt=0;
public static void split(String words,String regexp) {cuts=words.split(regexp);}
public static String next() {return(cnt<cuts.length ? cuts[cnt++] : "null");}

#!bash

IN="[email protected];[email protected]"

: j.split($IN,";")    # java method call

while true
do
    NAME=j.next()     # java method call
    if [ $NAME != null ] ; then echo $NAME ; else exit ; fi
done

Upvotes: -11

you can apply awk to many situations

echo "[email protected];[email protected]"|awk -F';' '{printf "%s\n%s\n", $1, $2}'

also you can use this

echo "[email protected];[email protected]"|awk -F';' '{print $1,$2}' OFS="\n"

Upvotes: 12

bisgardo
bisgardo

Reputation: 4630

The following Bash/zsh function splits its first argument on the delimiter given by the second argument:

split() {
    local string="$1"
    local delimiter="$2"
    if [ -n "$string" ]; then
        local part
        while read -d "$delimiter" part; do
            echo $part
        done <<< "$string"
        echo $part
    fi
}

For instance, the command

$ split 'a;b;c' ';'

yields

a
b
c

This output may, for instance, be piped to other commands. Example:

$ split 'a;b;c' ';' | cat -n
1   a
2   b
3   c

Compared to the other solutions given, this one has the following advantages:

  • IFS is not overriden: Due to dynamic scoping of even local variables, overriding IFS over a loop causes the new value to leak into function calls performed from within the loop.

  • Arrays are not used: Reading a string into an array using read requires the flag -a in Bash and -A in zsh.

If desired, the function may be put into a script as follows:

#!/usr/bin/env bash

split() {
    # ...
}

split "$@"

Upvotes: 15

Steven Lizarazo
Steven Lizarazo

Reputation: 5454

This worked for me:

string="1;2"
echo $string | cut -d';' -f1 # output is 1
echo $string | cut -d';' -f2 # output is 2

Upvotes: 178

kenorb
kenorb

Reputation: 166879

Here is a clean 3-liner:

in="foo@bar;bizz@buzz;fizz@buzz;buzz@woof"
IFS=';' list=($in)
for item in "${list[@]}"; do echo $item; done

where IFS delimit words based on the separator and () is used to create an array. Then [@] is used to return each item as a separate word.

If you've any code after that, you also need to restore $IFS, e.g. unset IFS.

Upvotes: 24

rashok
rashok

Reputation: 13484

IN="[email protected];[email protected]"
IFS=';'
read -a IN_arr <<< "${IN}"
for entry in "${IN_arr[@]}"
do
    echo $entry
done

Output

[email protected]
[email protected]

System : Ubuntu 12.04.1

Upvotes: 3

Emilien Brigand
Emilien Brigand

Reputation: 11021

Without setting the IFS

If you just have one colon you can do that:

a="foo:bar"
b=${a%:*}
c=${a##*:}

you will get:

b = foo
c = bar

Upvotes: 37

Petr &#218;jezdsk&#253;
Petr &#218;jezdsk&#253;

Reputation: 1269

Maybe not the most elegant solution, but works with * and spaces:

IN="bla@so me.com;*;[email protected]"
for i in `delims=${IN//[^;]}; seq 1 $((${#delims} + 1))`
do
   echo "> [`echo $IN | cut -d';' -f$i`]"
done

Outputs

> [bla@so me.com]
> [*]
> [[email protected]]

Other example (delimiters at beginning and end):

IN=";bla@so me.com;*;[email protected];"
> []
> [bla@so me.com]
> [*]
> [[email protected]]
> []

Basically it removes every character other than ; making delims eg. ;;;. Then it does for loop from 1 to number-of-delimiters as counted by ${#delims}. The final step is to safely get the $ith part using cut.

Upvotes: 1

gniourf_gniourf
gniourf_gniourf

Reputation: 46903

In Bash, a bullet proof way, that will work even if your variable contains newlines:

IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")

Look:

$ in=$'one;two three;*;there is\na newline\nin this field'
$ IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")
$ declare -p array
declare -a array='([0]="one" [1]="two three" [2]="*" [3]="there is
a newline
in this field")'

The trick for this to work is to use the -d option of read (delimiter) with an empty delimiter, so that read is forced to read everything it's fed. And we feed read with exactly the content of the variable in, with no trailing newline thanks to printf. Note that's we're also putting the delimiter in printf to ensure that the string passed to read has a trailing delimiter. Without it, read would trim potential trailing empty fields:

$ in='one;two;three;'    # there's an empty field
$ IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")
$ declare -p array
declare -a array='([0]="one" [1]="two" [2]="three" [3]="")'

the trailing empty field is preserved.


Update for Bash≥4.4

Since Bash 4.4, the builtin mapfile (aka readarray) supports the -d option to specify a delimiter. Hence another canonical way is:

mapfile -d ';' -t array < <(printf '%s;' "$in")

Upvotes: 40

Victor Choy
Victor Choy

Reputation: 4256

There is a simple and smart way like this:

echo "add:sfff" | xargs -d: -i  echo {}

But you must use gnu xargs, BSD xargs cant support -d delim. If you use apple mac like me. You can install gnu xargs :

brew install findutils

then

echo "add:sfff" | gxargs -d: -i  echo {}

Upvotes: 12

ajaaskel
ajaaskel

Reputation: 1709

IN='[email protected];[email protected];Charlie Brown <[email protected];!"#$%&/()[]{}*? are no problem;simple is beautiful :-)'
set -f
oldifs="$IFS"
IFS=';'; arrayIN=($IN)
IFS="$oldifs"
for i in "${arrayIN[@]}"; do
echo "$i"
done
set +f

Output:

[email protected]
[email protected]
Charlie Brown <[email protected]
!"#$%&/()[]{}*? are no problem
simple is beautiful :-)

Explanation: Simple assignment using parenthesis () converts semicolon separated list into an array provided you have correct IFS while doing that. Standard FOR loop handles individual items in that array as usual. Notice that the list given for IN variable must be "hard" quoted, that is, with single ticks.

IFS must be saved and restored since Bash does not treat an assignment the same way as a command. An alternate workaround is to wrap the assignment inside a function and call that function with a modified IFS. In that case separate saving/restoring of IFS is not needed. Thanks for "Bize" for pointing that out.

Upvotes: 2

18446744073709551615
18446744073709551615

Reputation: 16872

In Android shell, most of the proposed methods just do not work:

$ IFS=':' read -ra ADDR <<<"$PATH"                             
/system/bin/sh: can't create temporary file /sqlite_stmt_journals/mksh.EbNoR10629: No such file or directory

What does work is:

$ for i in ${PATH//:/ }; do echo $i; done
/sbin
/vendor/bin
/system/sbin
/system/bin
/system/xbin

where // means global replacement.

Upvotes: 2

fedorqui
fedorqui

Reputation: 290415

Apart from the fantastic answers that were already provided, if it is just a matter of printing out the data you may consider using awk:

awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "$IN"

This sets the field separator to ;, so that it can loop through the fields with a for loop and print accordingly.

Test

$ IN="[email protected];[email protected]"
$ awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "$IN"
> [[email protected]]
> [[email protected]]

With another input:

$ awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "a;b;c   d;e_;f"
> [a]
> [b]
> [c   d]
> [e_]
> [f]

Upvotes: 3

Michael Hale
Michael Hale

Reputation: 1407

A one-liner to split a string separated by ';' into an array is:

IN="[email protected];[email protected]"
ADDRS=( $(IFS=";" echo "$IN") )
echo ${ADDRS[0]}
echo ${ADDRS[1]}

This only sets IFS in a subshell, so you don't have to worry about saving and restoring its value.

Upvotes: 1

Related Questions