Reputation: 73
My previous question was flagged as a duplicate, and I was pointed to this and this. The solutions provided in those threads do not solve this at all.
Content of file.txt:
Some line of text 0
Some line of text 1
Some line of text 2
PATTERN1
Some line of text 3
Some line of text 4
Some line of text 5
PATTERN2
Some line of text 6
Some line of text 7
Some line of text 8
PATTERN1
Some line of text 9
Some line of text 10
Some line of text 11
PATTERN2
Some line of text 12
Some line of text 13
Some line of text 14
I need to extract "PATTERN1" and "PATTERN2" + lines in between, and the following command does this perfectly:
awk '/PATTERN1 /,/PATTERN2/' ./file.txt
Output:
PATTERN1
Some line of text 3
Some line of text 4
Some line of text 5
PATTERN2
PATTERN1
Some line of text 9
Some line of text 10
Some line of text 11
PATTERN2
But now I am trying to create a bash script that stores each block, from PATTERN1 through PATTERN2, in its own array element.
To clarify, that means storing the following lines (inside the quotes):
"PATTERN1
Some line of text 3
Some line of text 4
Some line of text 5
PATTERN2"
to array[0]
and store the following lines inside the quotes:
"PATTERN1
Some line of text 9
Some line of text 10
Some line of text 11
PATTERN2"
to array[1]
and so on, if there are more occurrences of PATTERN1 and PATTERN2.
What I currently have:
#!/bin/bash
var0=`cat ./file.txt`
mapfile -t thearray < <(echo "$var0" | awk '/PATTERN1 /,/PATTERN2/')
The above does not work.
And as much as possible I do not want to use mapfile, because the script might be executed on a system that does not support it.
Based on this link provided:
myvar=$(cat ./file.txt)
myarray=($(echo "$var0" | awk '/PATTERN1 /,/PATTERN2/'))
But when I do echo ${myarray[1]}
I get a blank response.
And when I do echo ${myarray[0]}
I get:
PATTERN1 Some line of text 3 Some line of text 4 Some line of text 5 PATTERN2 PATTERN1 Some line of text 9 Some line of text 10 Some line of text 11 PATTERN2
What I expect when I do echo ${myarray[0]}
PATTERN1 Some line of text 3 Some line of text 4 Some line of text 5 PATTERN2
What I expect when I do echo ${myarray[1]}
PATTERN1 Some line of text 9 Some line of text 10 Some line of text 11 PATTERN2
Any help will be great.
Upvotes: 1
Views: 926
Reputation: 15273
As Charles suggested...
while IFS= read -r -d '' x; do array+=("$x"); done < <(awk '
/PATTERN1/,/PATTERN2/ { if ( $0 ~ "PATTERN2" ) { x=$0; printf "%s%c",x,0; next }
print }' ./file.txt)
I reformatted it. It was getting kinda busy and hard to read.
And to test it -
$: echo "[${array[1]}]"
[PATTERN1
Some line of text 9
Some line of text 10
Some line of text 11
PATTERN2]
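The loop above relies on read -d '', which makes bash treat the NUL byte as the record terminator instead of the newline, so each record can keep its embedded newlines. Here is a minimal sketch of just that reading half, assuming bash; printf '%s\0' stands in for the awk command, and the sample strings and the names records and rec are made up for illustration:

```shell
#!/bin/bash
# printf '%s\0' stands in for awk here: it writes each argument followed
# by a NUL byte, which is the record layout the read loop expects.
records=()
while IFS= read -r -d '' rec; do
    records+=("$rec")          # rec keeps its embedded newlines
done < <(printf '%s\0' $'PATTERN1\nline a\nPATTERN2' $'PATTERN1\nline b\nPATTERN2')

echo "records: ${#records[@]}"
echo "[${records[1]}]"
```

NUL is a convenient separator because a bash variable can never contain it, so it cannot collide with the data being stored.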
As an aside, it seems very odd to me to include the redundant sentinel values in the data elements, so if you want to strip those:
$: while IFS= read -r -d '' x; do array+=("$x"); done < <(
awk '/PATTERN1/,/PATTERN2/{ if ( $0 ~ "PATTERN1" ) { next }
  if ( $0 ~ "PATTERN2" ) {
    # emit buffered lines in order; NUL ends the record
    for (i = 0; i < len; i++) { printf "%s%c", ary[i], (i < len-1 ? "\n" : 0); }
    delete ary; len=0; next }
  ary[len++]=$0;
}' ./file.txt )
$: echo "[${array[1]}]"
[Some line of text 9
Some line of text 10
Some line of text 11]
Upvotes: 2
Reputation: 73
Paul's answer does what I want, so I marked it as the accepted answer. His solution does leave an extra blank line at the end of every value stored in the array, but that is easy to remove, so I did not mind. I also posted this same question on another site, and although Paul's answer was good, I found a better solution there:
IFS=$'\r' read -r -d '' -a ARR < <(awk '/PATTERN1/,/PATTERN2/ {if ($0 ~ /PATTERN2/) printf "%s\r", $0; else print}' file.txt)
The above does the job, does not produce an extra blank line, and it's a one-liner.
echo "${ARR[1]}"
echo "${ARR[0]}"
Output:
PATTERN1
Some line of text 9
Some line of text 10
Some line of text 11
PATTERN2
PATTERN1
Some line of text 3
Some line of text 4
Some line of text 5
PATTERN2
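For anyone who wants to try the carriage-return approach in isolation, here is a self-contained sketch. It assumes bash and that the input contains no carriage returns of its own (a file with Windows line endings would be split in the wrong places). The temporary file and its contents are made up for illustration. This variant passes the line to printf through "%s" so that a % in the data is not read as a format directive, and it uses read -r -d '' to read up to end of input:

```shell
#!/bin/bash
# Sample input in a temporary file (hypothetical data, not the asker's file).
tmp=$(mktemp)
printf '%s\n' 'skip me' 'PATTERN1' '100% literal' 'PATTERN2' 'skip me too' \
              'PATTERN1' 'second block' 'PATTERN2' > "$tmp"

# awk ends each block with a carriage return instead of a newline, and
# read splits the whole stream on that carriage return (IFS).
# read returns non-zero when it reaches end of input without seeing the
# delimiter, so "|| true" keeps that from aborting a script under set -e.
IFS=$'\r' read -r -d '' -a ARR < <(
    awk '/PATTERN1/,/PATTERN2/ {
             if ($0 ~ /PATTERN2/) { printf "%s\r", $0 } else { print }
         }' "$tmp"
) || true
rm -f "$tmp"

echo "count: ${#ARR[@]}"
printf '[%s]\n' "${ARR[@]}"
```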
Upvotes: 0
Reputation: 10123
An implementation in plain bash could look something like this:
#!/bin/bash
beginpat='PATTERN1'
endpat='PATTERN2'
array=()
n=-1
inpatterns=
while read -r; do
    if [[ ! $inpatterns && $REPLY = $beginpat ]]; then
        array[++n]=$REPLY
        inpatterns=1
    elif [[ $inpatterns ]]; then
        array[n]+=$'\n'$REPLY
        if [[ $REPLY = $endpat ]]; then
            inpatterns=
        fi
    fi
done
# Report captured lines
for ((i = 0; i <= n; ++i)); do
    printf "=== array[%d] ===\n%s\n\n" "$i" "${array[i]}"
done
Run it as ./script < file. Using awk isn't required, but the script also works correctly on the awk output.
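If the same logic is needed in more than one place, it could be wrapped in a function. The following is only a sketch along those lines, assuming bash; collect_blocks and the global array blocks are hypothetical names, and the here-document holds a shortened copy of the sample data:

```shell
#!/bin/bash
# collect_blocks is a hypothetical helper: it reads stdin and fills the
# global array "blocks" with one begin..end block per element.
collect_blocks() {
    local begin=$1 end=$2 inside='' line n=-1
    blocks=()
    # "|| [[ -n $line ]]" also handles a last line with no trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ -z $inside && $line == "$begin" ]]; then
            blocks[++n]=$line               # start a new element
            inside=1
        elif [[ -n $inside ]]; then
            blocks[n]+=$'\n'$line           # append to the current element
            if [[ $line == "$end" ]]; then inside=''; fi
        fi
    done
}

collect_blocks PATTERN1 PATTERN2 <<'EOF'
Some line of text 2
PATTERN1
Some line of text 3
PATTERN2
Some line of text 6
PATTERN1
Some line of text 9
PATTERN2
Some line of text 12
EOF

echo "${#blocks[@]} blocks"
printf '%s\n---\n' "${blocks[@]}"
```

Quoting "$begin" and "$end" on the right-hand side of == makes the comparison literal, so a marker containing glob characters such as * is not treated as a pattern.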
Upvotes: 3