Reputation: 17
I am trying to extract text from a multi-line file. For example, I need to extract all text from "Section 1.0" to "Section 3.0".
This text can span many lines.
I have code that works, but it seems clumsy and slow. Is there a better way to do this, perhaps with sed or a regular expression?
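flag="false"
temp_var=""
# Assumes IFS is set to a newline so that ${textFile} splits into lines, not words.
for line in ${textFile};
do
    # Stop collecting when the end marker is reached.
    if [ "$line" = "Section 3.0" ]; then
        flag="false"
    fi
    # Collect lines between the markers, keeping the line breaks.
    if [ "$flag" = "true" ]; then
        temp_var+="$line"$'\n'
    fi
    # Start collecting on the line after the start marker.
    if [ "$line" = "Section 1.0" ]; then
        flag="true"
    fi
done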
Upvotes: 0
Views: 340
Reputation: 41446
awk can also be used:
awk '/Section 3\.0/{f=0} f; /Section 1\.0/{f=1}' file
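The one-liner above depends on the order of its three rules: the end marker clears the flag before the print rule runs, and the start marker sets the flag only after the print rule has run, so neither marker line is printed. A minimal sketch of the same program spread over several lines, assuming a hypothetical sample file (its contents and the variable names sample and between are made up for illustration):

```shell
# Hypothetical sample input; the contents are illustrative only.
sample=$(mktemp)
printf '%s\n' 'intro' 'Section 1.0' 'alpha' 'beta' 'Section 3.0' 'outro' > "$sample"

between=$(awk '
  /Section 3\.0/ { f = 0 }   # end marker: clear the flag before the print rule runs
  f                          # a bare pattern prints the line while f is nonzero
  /Section 1\.0/ { f = 1 }   # start marker: set the flag after the print rule has run
' "$sample")

printf '%s\n' "$between"
rm -f "$sample"
```

With this input, only the lines strictly between the two markers are captured in between.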
Upvotes: 0
Reputation: 784888
Using sed you can do:
sed -n '/Section 1\.0/,/Section 3\.0/p' file
EDIT: To exclude the lines matching the start and end patterns, use:
sed -n '/Section 1\.0/,/Section 3\.0/{/Section [13]\.0/!p;}' file
awk solution (this also excludes the start and end lines):
awk '/Section 3\.0/{flag=0} flag{print} /Section 1\.0/{flag=1}' file
Upvotes: 3
Reputation: 189307
sed -n '/Section 1\.0/,/Section 3\.0/p' file
will print from file all lines from a line matching the first regex anywhere in it through the next line matching the second expression, inclusive. If there are multiple such matches, they will be printed in flip-flop fashion (look for pattern 1, print through pattern 2, look for pattern 1...)
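To make the flip-flop behaviour concrete, here is a small sketch that runs the same sed range against a hypothetical sample file containing two Section 1.0 ... Section 3.0 blocks. The file contents and the variable names sample and flipflop are assumptions for illustration.

```shell
# Hypothetical sample input with two marked blocks and some surrounding lines.
sample=$(mktemp)
printf '%s\n' \
  'intro' \
  'Section 1.0' \
  'alpha' \
  'Section 3.0' \
  'skipped' \
  'Section 1.0' \
  'beta' \
  'Section 3.0' \
  'outro' > "$sample"

# Each range is printed including its marker lines; lines outside a range are dropped.
flipflop=$(sed -n '/Section 1\.0/,/Section 3\.0/p' "$sample")

printf '%s\n' "$flipflop"
rm -f "$sample"
```

Both blocks come out, marker lines included, while intro, skipped and outro are not printed.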
If you want only the first such section, you can quit when you find the end condition:
sed -n '/Section 3\.0/q;/Section 1\.0/,$p' file
This will exclude the line matching the end condition (guessing that's what you actually want). For simplicity, this assumes you have no Section 3.0 before Section 1.0. (Some sed dialects might require slightly different syntax; the semicolon may have to be changed to a newline, or the script split into two separate -e arguments.)
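As a sketch of the two-argument variant mentioned above, the script can be passed as separate -e expressions and the result captured in a shell variable. The sample file contents and the variable names sample and first_only are hypothetical.

```shell
# Hypothetical sample input; the second block should not appear in the result.
sample=$(mktemp)
printf '%s\n' \
  'intro' \
  'Section 1.0' \
  'alpha' \
  'beta' \
  'Section 3.0' \
  'gamma' \
  'Section 1.0' \
  'delta' \
  'Section 3.0' > "$sample"

# First expression: quit at the end marker (with -n it is not printed).
# Second expression: print from the start marker through the last line.
first_only=$(sed -n -e '/Section 3\.0/q' -e '/Section 1\.0/,$p' "$sample")

printf '%s\n' "$first_only"
rm -f "$sample"
```

Note that the start line itself is still printed here; only the end line is left out, because sed quits before reaching the print command.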
Upvotes: 2