MarvinLeRouge

Reputation: 534

Quickest way to split a string in bash

The goal: produce a path from an integer. I need to split strings into fixed-length pieces (2 characters in this case) and then join the pieces with a separator. Example: 123456 => 12/34/56, 12345 => 12/34/5.

I found a solution with sed:

sed 's/\(..\)/\1\//g'

but I'm not sure it's really quick. I don't need any analysis of the string content (which will always be an integer, if that matters); I only need to split it into pieces of length 2 (with a last piece of length 1 if the original length is odd).

Upvotes: 3

Views: 4166

Answers (4)

Benjamin W.

Reputation: 52291

You can split a string into elements using the fold command, read the elements into an array with readarray and process substitution, and then insert the field separator using IFS:

$ var=123456
$ readarray -t arr < <(fold -w2 <<< "$var")
$ (IFS=/; echo "${arr[*]}")
12/34/56

I put the last command in a subshell so the change to IFS is not persistent.

Notice that the [*] syntax is required here, or IFS won't be used as the output separator, i.e., the usually preferred [@] wouldn't work.

readarray and its synonym mapfile require Bash 4.0 or newer.

This works with an odd number of elements as well:

$ var=12345
$ readarray -t arr < <(fold -w2 <<< "$var")
$ (IFS=/; echo "${arr[*]}")
12/34/5
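
The steps above can be wrapped in a function so that the IFS change stays local without a subshell. This is only a sketch: the function name split_fold is made up for illustration, and it assumes Bash 4.0 or newer and an input without embedded newlines.

```shell
#!/bin/bash
# Hypothetical helper (the name split_fold is illustrative, not from the answer).
# Splits its first argument into 2-character chunks and joins them with "/".
split_fold() {
    local -a parts
    # fold -w2 prints one chunk per line; readarray -t drops the newlines.
    readarray -t parts < <(fold -w2 <<< "$1")
    # Declaring IFS local confines the separator change to this call.
    local IFS=/
    echo "${parts[*]}"
}

split_fold 123456   # 12/34/56
split_fold 12345    # 12/34/5
```

Note that each call still forks a fold process, so this is likely slower than pure parameter expansion when called many times in a loop.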

Upvotes: 0

sahaquiel

Reputation: 1838

TL;DR

sed is fast enough.


If we are talking about speed, let's check. I think sed is the shortest solution, but for comparison I'll take @choroba's shell script:

$ wc -l hugefile 
10877493 hugefile

Sed:

sed 's/\(..\)/\1\//g' hugefile

Output:

real    0m25.432s
user    0m8.731s
sys     0m10.123s

Script:

#!/bin/bash
while IFS='' read -r s ; do
    o=""
    for (( pos=0 ; pos<${#s} ; pos+=2 )) ; do
        o+=${s:pos:2}/
    done
    o=${o%/}
    echo "$o"
done < hugefile

It ran for a really long time, so I interrupted it at:

real    1m19.480s
user    1m14.795s
sys     0m4.683s

So on my PC (Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz, MemTotal: 16324532 kB), sed makes around 426568 (close to half a million) string modifications per second. That seems fast enough.
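
One detail about the sed command timed above: for an even-length input it also appends the separator after the last pair (123456 becomes 12/34/56/). A possible variant that strips that trailing slash is sketched below. The function name split_sed is made up for illustration, and | is used as the sed delimiter only to avoid escaping /.

```shell
#!/bin/bash
# Hypothetical helper (the name split_sed is illustrative).
# First expression: append "/" after every pair of characters.
# Second expression: remove the trailing "/" left on even-length input.
split_sed() {
    printf '%s\n' "$1" | sed 's|..|&/|g; s|/$||'
}

split_sed 123456   # 12/34/56
split_sed 12345    # 12/34/5
```

The same two expressions can be run directly on a file, as in sed 's|..|&/|g; s|/$||' hugefile. I have not timed this variant.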

Upvotes: 1

choroba

Reputation: 241978

Use parameter expansion. ${var:position:length} extracts substrings, ${#var} returns the length of the value, ${var%final} removes "final" from the end of the value. Run it in a loop for strings of unknown length:

#!/bin/bash
for s in 123456 1234567 ; do
    o=""
    for (( pos=0 ; pos<${#s} ; pos+=2 )) ; do
        o+=${s:pos:2}/
    done
    o=${o%/}
    echo "$o"
done
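
The same loop can be packaged as a reusable function. This is a minimal sketch: the name split_every and the optional width and separator arguments are additions for illustration, not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper (name and optional arguments are illustrative).
# Usage: split_every STRING [WIDTH] [SEPARATOR]
split_every() {
    local s=$1 width=${2:-2} sep=${3:-/}
    local out="" pos
    for (( pos = 0; pos < ${#s}; pos += width )); do
        # Append the next chunk followed by the separator.
        out+=${s:pos:width}$sep
    done
    # Strip the single trailing separator added by the last iteration.
    printf '%s\n' "${out%"$sep"}"
}

split_every 123456        # 12/34/56
split_every 1234567       # 12/34/56/7
split_every 123456 3 -    # 123-456
```

Everything here is a shell builtin, so no external process is started per call.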

Upvotes: 2

Nahuel Fouilleul

Reputation: 19315

Bash parameter expansion can extract substrings:

var=123456
echo "${var:0:2}"  # first two characters
echo "${var:2:2}"  # next two
echo "${var:4:2}"  # etc.

Joining manually with /:

echo "${var:0:2}/${var:2:2}/${var:4:2}"
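
The manual join hard-codes three pieces, so it fits inputs of 5 or 6 characters. The sketch below shows what happens with a shorter value. The function name manual_join is made up for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper (the name manual_join is illustrative).
# Always emits exactly three pieces; a piece past the end of the string is empty.
manual_join() {
    local var=$1
    echo "${var:0:2}/${var:2:2}/${var:4:2}"
}

manual_join 123456   # 12/34/56
manual_join 12345    # 12/34/5
manual_join 1234     # 12/34/  (third piece is empty, leaving a trailing slash)
```

For strings of unknown length, a loop over the positions avoids the empty trailing pieces.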

Upvotes: 4
