Reputation: 79
cat massive_data.txt
Will
12
123
1234
12345
/>
Liu
23
34
/>
Will
1234
12345
/>
Will
1234
12345
.
.
.
In the above text, I want to fetch the lines between every "Will" and "/>" and ignore the others. The number of lines in between is variable. I used the command below but got an inaccurate result:
sed -n '/\<Sector/,/\/\>/p' massive_data.txt
Will
12
123
1234
12345
/>
Will
1234
12345
/>
Will
1234
12345
.
.
.
How can I use "sed" or "awk" to solve the problem? The expected result is as below:
Will
12
123
1234
12345
/>
Will
1234
12345
/>
.
.
.
Upvotes: 3
Views: 49
Reputation: 785761
You can use awk like this:
awk '$1 == "Will"{p=1} p{data = data $0 RS} $1 == "/>"{print data; p=0; data=""}' file
Will
12
123
1234
12345
/>
Will
1234
12345
/>
Explanation:
$1 == "Will"{p=1} : Set flag p=1 when the first column is "Will".
p{data = data $0 RS} : If p==1, keep appending each line to a variable data.
$1 == "/>"{print data; p=0; data=""} : If the first column is />, print data and reset the p and data variables.
If there is a blank line after />, then you can also use awk like this:
awk -v ORS='\n\n' -v RS= '/^Will/ && /\/>$/' file
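For reference, the flag-and-buffer command explained above can be written out as a commented script. This is only a sketch: the function names sample_data and extract_blocks and the inline sample are made up for illustration, and it uses printf instead of print so that no extra blank line is emitted after a block or for a stray /> that closes a non-"Will" block.

```shell
#!/bin/sh
# Hypothetical sample mirroring the question's data; the last "Will" block has no "/>".
sample_data() {
  printf '%s\n' Will 12 123 1234 12345 '/>' Liu 23 34 '/>' \
                Will 1234 12345 '/>' Will 1234 12345
}

# Reads stdin and prints only the blocks that run from "Will" to "/>".
extract_blocks() {
  awk '
    $1 == "Will" { p = 1 }                # opening line: start collecting
    p            { data = data $0 RS }    # while collecting, buffer the line
    $1 == "/>"   { printf "%s", data      # closing line: emit the buffer as-is
                   p = 0; data = "" }     # then reset the flag and the buffer
  '
}

sample_data | extract_blocks
```

Because the buffer is only printed when a /> line arrives, a trailing "Will" block that is never closed is silently dropped, which is what the question asks for.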
Upvotes: 2
Reputation: 37464
$ awk 'BEGIN{RS=""}/^Will/&&/\/>/' file
Will
12
123
1234
12345
/>
Will
1234
12345
/>
Empty RS splits records at empty lines. The script prints records that start with Will and contain />.
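A small reproduction of the paragraph-mode approach, as a sketch only. It assumes the blocks in the file are separated by blank lines, which the listing in the question does not show; the function names sample_paragraphs and extract_paragraphs and the sample data are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample: blocks separated by blank lines; the last block has no "/>".
sample_paragraphs() {
  printf '%s\n' Will 12 123 '/>' '' Liu 23 34 '/>' '' \
                Will 1234 12345 '/>' '' Will 1234 12345
}

# RS="" puts awk in paragraph mode, so each blank-line-separated block is one record.
# A record is printed only if it starts with "Will" and contains "/>".
extract_paragraphs() {
  awk 'BEGIN { RS = "" } /^Will/ && /\/>/'
}

sample_paragraphs | extract_paragraphs
```

If the real file has no blank lines between blocks, the whole file becomes a single record and this filter no longer separates the blocks.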
Upvotes: 1
Reputation: 117
Based on what I understood:
cat filename | sed -n '/Will/,/>/p' | grep -v "/>"
Output:
Will
12
123
1234
12345
Will
1234
12345
Will
1234
12345
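Note that the output above still contains the last, unterminated "Will" block and no longer has the /> lines. If a sed-only solution is wanted, one possible sketch is to buffer each block in the hold space and print it only once the closing /> is seen. The function names and sample data are hypothetical, and it assumes "Will" and "/>" each sit alone on their line.

```shell
#!/bin/sh
# Hypothetical sample mirroring the question's data; the last "Will" block has no "/>".
sample_lines() {
  printf '%s\n' Will 12 123 1234 12345 '/>' Liu 23 34 '/>' \
                Will 1234 12345 '/>' Will 1234 12345
}

# Reads stdin; prints a block only after its closing "/>" has been seen.
extract_with_sed() {
  sed -n '
    /^Will$/,/^\/>$/{
      # append every line of the range to the hold space
      H
      /^\/>$/{
        # closing line: swap the buffered block into the pattern space
        x
        # drop the leading newline that the first H added
        s/^\n//
        p
        # empty the buffer and swap it back for the next block
        s/.*//
        x
      }
    }
  '
}

sample_lines | extract_with_sed
```

An unterminated block at the end of the file stays in the hold space and is never printed.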
Upvotes: 0