How to store each occurrence of multiline string in array using bash regex

Question

Given a text file test.txt with contents:

hello
someline1
someline2
...
world1

line that shouldn't match

hello
someline1
someline2
...
world2

How can I store both of these multiline matches in separate array indexes?

I'm currently trying to use regex="hello.*world[12]"

Unfortunately I can only use native Bash, so Perl etc is off the table. Thanks

Fravadona · Accepted Answer

I would use awk and mapfile (the -d option of mapfile requires bash version >= 4.4)

#!/bin/bash

mapfile -d '' arr < <(
    awk '/hello/{f=1} f; /world[12]/ && f {f=0; printf "\000"}' test.txt
)

The resulting array then holds one block per index:

arr=([0]=$'hello
someline1
someline2
...
world1
' [1]=$'hello
someline1
someline2
...
world2
')

notes:

awk '/hello/{f=1} f; /world[12]/ && f{f=0; printf "\000"}'
. when encountering hello, set the flag to true
. for each line, print it if the flag is true
. when encountering world[12] and the flag is true, set the flag to false and print a null-byte delimiter
mapfile -d '' arr
. split the input into an array whose elements are delimited by a null byte (instead of the default newline)

version for older bash:

#!/bin/bash
arr=()
while IFS='' read -r -d '' block
do
    arr+=( "$block" )
done < <(
    awk '/hello/{f=1} f; /world[12]/ && f{f=0; printf "\000"}' test.txt
)
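
If awk is not available either, the same flag idea can be written with only bash builtins. This is a sketch, not part of the answer above: the function name collect_blocks, its arguments and the result array blocks are hypothetical names chosen for illustration. It tests each line with [[ =~ ]] rather than matching one regex across lines.

```shell
#!/bin/bash
# Sketch only: collect_blocks, its parameters and the global array "blocks"
# are hypothetical names, not taken from the answer above.
collect_blocks() {
    local file=$1 start_re=$2 end_re=$3
    local line block='' inside=0
    blocks=()                               # global result array
    while IFS= read -r line || [[ -n $line ]]; do
        # A start line turns the flag on (same as /hello/{f=1} in awk).
        if [[ $line =~ $start_re ]]; then
            inside=1
        fi
        if (( inside )); then
            block+=$line$'\n'               # collect the line while the flag is set
            if [[ $line =~ $end_re ]]; then
                blocks+=( "$block" )        # the end line closes the block
                block=''
                inside=0
            fi
        fi
    done < "$file"
}

# Usage on a temporary sample file standing in for test.txt.
sample=$(mktemp)
printf '%s\n' \
    hello someline1 someline2 world1 \
    '' "line that shouldn't match" '' \
    hello someline1 someline2 world2 > "$sample"
collect_blocks "$sample" 'hello' 'world[12]'
rm -f "$sample"
printf 'blocks found: %d\n' "${#blocks[@]}"
```

A block that is opened but never closed by an end line is dropped in this sketch, which matches the behaviour of the read -d '' loop above.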
