Reputation: 173
I have searched for similar topics here, but most questions involve a single-character delimiter.
I have this sample of text:
Some text here,
continuing on next lineDELIMITERSecond chunk of text
which may as well continue on next lineDELIMITERFinal chunk
The desired output is a list (extracted=()) which contains:
Some text here,
continuing on next line
Second chunk of text
which may as well continue on next line
Final chunk
As can be seen from the sample, "DELIMITER" is the string to split on.
I have tried numerous examples from SO, including awk, string replacement, etc.
Upvotes: 3
Views: 2742
Reputation: 8711
You can try Perl. With the -0777 option, perl slurps the entire file into the $_ variable. You can then split the content on DELIMITER:
$ perl -0777 -ne '@x=split("DELIMITER");print join("\n\n",@x) ' hubbs.txt
Some text here,
continuing on next line
Second chunk of text
which may as well continue on next line
Final chunk
$
Adding array positions while printing
$ perl -0777 -ne '@x=split("DELIMITER"); for(@x) { print ++$i,". $_\n" } ' hubbs.txt
1. Some text here,
continuing on next line
2. Second chunk of text
which may as well continue on next line
3. Final chunk
$
Upvotes: 0
Reputation: 1060
I think the main challenge in the question is handling spaces, newlines, and DELIMITER correctly, and then putting everything into an array. If it were only a matter of splitting the file, it would be too easy. How about this template:
#!/bin/bash
# Generate a small bash script that reads each chunk through a here-document
gencode(){
    echo -e "extracted=(); read -r -d '' item <<-DELIMITER"
    sed 's:DELIMITER:\n&\nextracted+=("$item"); read -r -d "" item <<-&\n:' Input_file;
    echo -e "DELIMITER\n"'extracted+=("$item")'
}
gencode|cat -n # for explanation purposes only
eval "$(gencode)" # do not remove "eval"
for (( i=0; i < ${#extracted[@]}; i++ )); do # print results
    echo "$i: ${extracted[i]}"
done
Outputs
1 extracted=(); read -r -d '' item <<-DELIMITER
2 Some text here,
3 continuing on next line
4 DELIMITER
5 extracted+=("$item"); read -r -d "" item <<-DELIMITER
6 Second chunk of text
7 which may as well continue on next line
8 DELIMITER
9 extracted+=("$item"); read -r -d "" item <<-DELIMITER
10 Final chunk
11 DELIMITER
12 extracted+=("$item")
0: Some text here,
continuing on next line
1: Second chunk of text
which may as well continue on next line
2: Final chunk
Upvotes: 0
Reputation: 31
You can try using arrays.
#!/bin/bash
str="continuing on next lineDELIMITERSecond chunk of text
which may as well continue on next lineDELIMITERFinal chunk";
delimiter=DELIMITER
s=$str$delimiter
array=()
while [[ $s ]]; do
    array+=( "${s%%"$delimiter"*}" )
    s=${s#*"$delimiter"}
done
declare -p array
This splits your text on the delimiter, and the result is an array of the chunks:
array=([0]="continuing on next line" [1]=$'Second chunk of text\nwhich may as well continue on next line' [2]="Final chunk")
You can access each chunk using the array indices, or you can print all of them using printf '%s\n' "${array[@]}"
The result will be:
continuing on next line
Second chunk of text
which may as well continue on next line
Final chunk
The solution gives you an opportunity to do a lot with your text.
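As a minimal sketch, the same parameter-expansion loop can be wrapped in a function and applied to the full sample from the question. The function name split_on and the choice to fill a global array called extracted are assumptions made for illustration; they are not part of the answer above.

```shell
#!/bin/bash
# split_on is a hypothetical helper: split_on DELIM TEXT fills the global array "extracted".
split_on() {
    local delimiter=$1
    local s=$2$1                # append one delimiter so the last chunk is terminated
    extracted=()
    while [[ $s ]]; do
        extracted+=( "${s%%"$delimiter"*}" )   # text before the first delimiter
        s=${s#*"$delimiter"}                   # drop that text and the delimiter
    done
}

# Sample text from the question, inlined so the sketch is self-contained.
text=$'Some text here,\ncontinuing on next lineDELIMITERSecond chunk of text\nwhich may as well continue on next lineDELIMITERFinal chunk'

split_on DELIMITER "$text"
declare -p extracted
```

Quoting "$delimiter" inside the expansions makes bash treat it as a literal string rather than a glob pattern, so delimiters containing characters such as * or ? should still match literally. To read from a file instead, text=$(<Input_file) would work, keeping in mind that command substitution strips trailing newlines.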
Upvotes: 1
Reputation: 22012
With AWK please try the following:
awk -v RS='^$' -v FS='DELIMITER' '{
    n = split($0, extracted)
    for (i=1; i<=n; i++) {
        print i". "extracted[i]
    }
}' sample.txt
which yields:
1. Some text here,
continuing on next line
2. Second chunk of text
which may as well continue on next line
3. Final chunk
If you need to transfer the awk array to a bash array, a further step is needed, depending on how the array will be processed afterwards.
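One possible way to do that transfer is sketched below. It is not the exact command above: to stay portable it accumulates the input in a buffer instead of relying on RS='^$', and it assumes the ASCII record separator (octal 036) never occurs in the text, so it can mark chunk boundaries. The variable names sample, buf and parts are illustrative.

```shell
#!/bin/bash
# Sample text from the question, inlined so the sketch is self-contained.
sample=$'Some text here,\ncontinuing on next lineDELIMITERSecond chunk of text\nwhich may as well continue on next lineDELIMITERFinal chunk\n'

# awk joins all lines, splits on DELIMITER and prints the chunks separated by \036.
# read -d '' reads up to end of input and splits on IFS (\036 only), so newlines
# inside a chunk are kept. read returns non-zero at end of input, hence "|| true".
IFS=$'\036' read -r -d '' -a extracted < <(
    printf '%s' "$sample" |
    awk '{ buf = buf $0 "\n" }
         END {
             sub(/\n$/, "", buf)                 # drop the final newline
             n = split(buf, parts, "DELIMITER")
             for (i = 1; i <= n; i++)
                 printf "%s%s", parts[i], (i < n ? "\036" : "")
         }'
) || true

declare -p extracted
```

With a real file, replace the printf pipeline with awk '...' sample.txt. If the text could contain the \036 byte, a different separator would be needed.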
Upvotes: 1
Reputation: 133428
In case you don't want to change the default RS value, could you please try the following:
awk '{gsub("DELIMITER",ORS)} 1' Input_file
Upvotes: 5
Reputation: 7215
You can try something like:
awk 'BEGIN {RS="DELIMITER";} {print}' input_file
You can then assign the output to a variable, etc.
Upvotes: 0