Todd Iglehart

Reputation: 17

How do I extract text from a file in a bash script

I am trying to extract text from a multi-line file. For example, I need to extract all text from "Section 1.0" to "Section 3.0".

The text can span many lines.

I have code that works, but it seems clumsy and slow. Is there a better way to do this, perhaps with sed or a regular expression?

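flag="false"
temp_var=""

# Read the file line by line (assumes textFile holds the path to the file).
while IFS= read -r line
do
   if [ "$line" = "Section 3.0" ]; then
      flag="false"
   fi
   if [ "$flag" = "true" ]; then
      # Append the line plus a newline so the extracted text keeps its line breaks.
      temp_var+="$line"$'\n'
   fi
   if [ "$line" = "Section 1.0" ]; then
      flag="true"
   fi
done < "$textFile"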

Upvotes: 0

Views: 340

Answers (3)

Jotne

Reputation: 41446

awk can also be used:

awk '/Section 3\.0/{f=0} f; /Section 1\.0/{f=1}' file
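As a rough check, the one-liner can be run against a small sample file. The file contents and the variable names (sample, between) are made up for illustration. awk applies the three rules in order to every line: the end marker clears f before the bare f pattern gets a chance to print, and the start marker sets f only after the print test has run, so both marker lines are left out of the result.

```shell
# Hypothetical sample input; a real file will look different.
sample=$(mktemp)
printf '%s\n' 'Intro' 'Section 1.0' 'alpha' 'beta' 'Section 3.0' 'tail' > "$sample"

# Rule 1 clears f on the end marker, rule 2 (a bare pattern with no action)
# prints the line while f is non-zero, rule 3 sets f on the start marker.
between=$(awk '/Section 3\.0/{f=0} f; /Section 1\.0/{f=1}' "$sample")

printf '%s\n' "$between"
rm -f "$sample"
```

With this sample, only the lines alpha and beta are printed.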

Upvotes: 0

anubhava

Reputation: 784888

Using sed you can do:

sed -n '/Section 1\.0/,/Section 3\.0/p' file

EDIT: To exclude the lines that match the start and end patterns, use:

sed -n '/Section 1\.0/,/Section 3\.0/{/Section [13]\.0/!p;}' file
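To see the difference between the two sed commands, here is a small sketch on a made-up file (the contents and the names sample, inclusive and exclusive are assumptions for illustration only). The first command keeps the marker lines, the second filters them out inside the range.

```shell
# Hypothetical sample input.
sample=$(mktemp)
printf '%s\n' 'Intro' 'Section 1.0' 'alpha' 'beta' 'Section 3.0' 'tail' > "$sample"

# Range address: prints the start marker, the body and the end marker.
inclusive=$(sed -n '/Section 1\.0/,/Section 3\.0/p' "$sample")

# Same range, but inside it only lines NOT matching either marker are printed.
exclusive=$(sed -n '/Section 1\.0/,/Section 3\.0/{/Section [13]\.0/!p;}' "$sample")

printf 'inclusive:\n%s\n\nexclusive:\n%s\n' "$inclusive" "$exclusive"
rm -f "$sample"
```

Note that the second form also drops any line inside the range that happens to contain "Section 1.0" or "Section 3.0", not just the first and last line.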

awk solution:

awk '/Section 3\.0/{flag=0} flag{print} /Section 1\.0/{flag=1}' file

Upvotes: 3

tripleee

Reputation: 189307

sed -n '/Section 1\.0/,/Section 3\.0/p' file

will print from file every line from one that matches the first regex anywhere in it through the next line that matches the second regex. If there are multiple such ranges, they will be printed in flip-flop fashion (look for pattern 1, print through pattern 2, look for pattern 1 again, and so on).
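The flip-flop behaviour is easy to check on a file that contains the markers twice. This is only a sketch; the sample contents and the names sample, ranges and count are invented for the demonstration.

```shell
# Hypothetical input with two Section 1.0 ... Section 3.0 ranges and a line between them.
sample=$(mktemp)
printf '%s\n' 'Section 1.0' 'first' 'Section 3.0' 'skipped' \
              'Section 1.0' 'second' 'Section 3.0' > "$sample"

# Both ranges are printed, markers included; the line between them is not.
ranges=$(sed -n '/Section 1\.0/,/Section 3\.0/p' "$sample")

# Count how many ranges were opened in the output.
count=$(printf '%s\n' "$ranges" | grep -c 'Section 1\.0')

printf '%s\n' "$ranges"
echo "ranges found: $count"
rm -f "$sample"
```

The line "skipped" sits between the two ranges, so it does not appear in the output.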

If you want only the first such section, you can quit when you find the end condition:

sed -n '/Section 3\.0/q;/Section 1\.0/,$p' file

This will exclude the line matching the end condition (guessing that's what you actually want). For simplicity, this assumes you have no Section 3.0 before Section 1.0. (Some sed dialects might require slightly different syntax; the semicolon may have to be changed to a newline, or the script split into two separate -e arguments.)
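For reference, a sketch of the single-script form next to the variant split into two -e arguments, run on a made-up file with two sections (the contents and the names sample, one_script and two_scripts are assumptions). Both should stop at the first Section 3.0 and print the same thing.

```shell
# Hypothetical input; the second section should never be reached.
sample=$(mktemp)
printf '%s\n' 'Intro' 'Section 1.0' 'first' 'Section 3.0' 'skipped' \
              'Section 1.0' 'second' 'Section 3.0' > "$sample"

# Single script, commands separated by a semicolon.
one_script=$(sed -n '/Section 3\.0/q;/Section 1\.0/,$p' "$sample")

# Same commands as two -e arguments, for sed dialects that dislike the semicolon.
two_scripts=$(sed -n -e '/Section 3\.0/q' -e '/Section 1\.0/,$p' "$sample")

printf '%s\n' "$one_script"
rm -f "$sample"
```

The Section 1.0 line itself is still printed here; only the end marker and everything after it are dropped.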

Upvotes: 2
