Liu Will

Reputation: 79

Print a variable number of lines between two matched patterns

cat massive_data.txt
Will
12
123
1234
12345
/>

Liu
23
34
/>

Will
1234
12345
/>

Will
1234
12345
.
.
.

In the above text, I want to fetch the lines between every "Will" and "/>" and ignore everything else. The number of lines in between varies. I used the command below but got an inaccurate result:

sed -n '/Will/,/\/>/p' massive_data.txt
Will
12
123
1234
12345
/>

Will
1234
12345
/>

Will
1234
12345
.
.
.

How can I use sed or awk to solve the problem? The expected result is as below:

Will
12
123
1234
12345
/>

Will
1234
12345
/>
.
.
.

Upvotes: 3

Views: 49

Answers (3)

anubhava

Reputation: 785761

You can use awk like this:

awk '$1 == "Will"{p=1} p{data = data $0 RS} $1 == "/>"{print data; p=0; data=""}' file

Will
12
123
1234
12345
/>


Will
1234
12345
/>

Explanation:

  • $1 == "Will"{p=1}: Set flag p=1 when first column is "Will"
  • p{data = data $0 RS}: If p==1 then keep appending each line into a variable data
  • $1 == "/>"{print data; p=0; data=""}: If first column is "/>" then print data and reset the p and data variables.
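
A self-contained way to try the flag-and-buffer approach is sketched below. It writes a small sample to a temporary file (the file name and contents are made up for illustration). It differs from the one-liner above in one assumed detail: the buffer is reset whenever a new "Will" line appears, so a "Will" block that never reaches "/>" is dropped instead of being merged into the next block.

```shell
# Sample input (hypothetical): two complete "Will" blocks, one "Liu" block,
# and a trailing "Will" block that is never closed by "/>".
sample=$(mktemp)
cat > "$sample" <<'EOF'
Will
12
123
/>

Liu
23
/>

Will
1234
/>

Will
1234
12345
EOF

# On "Will": set the flag and start a fresh buffer.
# While the flag is set: append each line to the buffer.
# On "/>": print the buffer (it already ends in a newline) and clear the flag.
result=$(awk '
  $1 == "Will" { p = 1; data = "" }
  p            { data = data $0 RS }
  $1 == "/>"   { if (p) printf "%s", data; p = 0; data = "" }
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Here printf is used so that no extra blank line follows each block; swapping it for print data would separate the blocks with an empty line, as in the output shown above.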

If every /> is followed by a blank line, you can also use awk in paragraph mode like this:

awk -v ORS='\n\n' -v RS= '/^Will/ && /\/>$/' file

Upvotes: 2

James Brown

Reputation: 37464

$ awk 'BEGIN{RS=""}/^Will/&&/\/>/' file
Will
12
123
1234
12345
/>
Will
1234
12345
/>

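An empty RS puts awk in paragraph mode, so records are split at empty lines. The script prints the records that start with Will and contain />. To require that a record ends in />, anchor the second pattern as /\/>$/.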
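
To see paragraph mode in isolation, here is a small sketch with made-up sample input. It assumes that every block is separated from the next by at least one empty line, and it anchors the second pattern with $ so that only records whose last characters are "/>" match.

```shell
# Hypothetical sample: blocks separated by empty lines; the last "Will"
# block has no closing "/>".
sample=$(mktemp)
printf '%s\n' 'Will' '12' '/>' '' 'Liu' '23' '/>' '' \
              'Will' '1234' '/>' '' 'Will' '1234' '12345' > "$sample"

# RS="" makes each blank-line-separated block a single record.
# A record is printed only if it starts with "Will" and ends with "/>".
records=$(awk 'BEGIN { RS = "" } /^Will/ && /\/>$/' "$sample")

printf '%s\n' "$records"
rm -f "$sample"
```

The unterminated last block and the "Liu" block are both skipped, because neither satisfies both patterns.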

Upvotes: 1

mayank arora

Reputation: 117

Based on what I understood:

sed -n '/Will/,/>/p' filename | grep -v "/>"

Output:

Will
12
123
1234
12345
Will
1234
12345
Will
1234
12345

Upvotes: 0
