Reputation: 35
I have a file with the content shown below. I want to extract the block between matching start and end patterns, but exclude any nested block whose numeric id (or pattern) does not match. Here, everything other than [001] has to be excluded, and the nested id (002 in this example) may not be known in advance. So I want only the lines that belong to the [001] block.
The file contains:
text [001] start
line 1
line 2
text [002] mid start
line 3
line 4
text [002] mid end
line 5
line 6
text [001] end
I need that block with the nested block of the non-matching id [002] excluded:
text [001] start
line 1
line 2
line 5
line 6
text [001] end
I couldn't find a clear explanation of this problem on the internet. Can anyone help with an awk or sed solution?
To get the block between the start and end patterns, I am trying:
awk '/[001]/ && /start/, /001/ && /end/' File
Upvotes: 1
Views: 278
Reputation: 58558
This might work for you (GNU sed):
sed -n '/\[001\]/,/\[001\]/{/\[002\]/,/\[002\]/!p}' file
Print only the lines between the [001] delimiters, excluding the lines between the [002] delimiters.
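A side note on the escaping, since the attempt in the question uses /[001]/ unescaped: in a regex, [001] is a bracket expression that matches a single 0 or 1 anywhere on the line, not the literal text "[001]". A small sketch of the difference, using a made-up three-line sample rather than the asker's file:

```shell
# Hypothetical sample input, chosen so that one plain line contains a "1".
sample='line 1
line 2
text [001] start'

# Unescaped: [001] is a bracket expression, so any line containing 0 or 1 matches.
unescaped=$(printf '%s\n' "$sample" | awk '/[001]/')

# Escaped: \[001\] matches only the literal characters "[001]".
escaped=$(printf '%s\n' "$sample" | awk '/\[001\]/')

printf 'unescaped:\n%s\n\nescaped:\n%s\n' "$unescaped" "$escaped"
```

The unescaped pattern also selects "line 1", which is why the delimiters need the backslashes.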
Upvotes: 0
Reputation: 204548
Assuming your blocks can be nested to any depth but never overlap:
$ cat tst.awk
BEGIN { tgtId="001" }
match($0,/\[[0-9]+\]/) {
    id = substr($0,RSTART+1,RLENGTH-2)
    state = $NF
}
state == "start" { isTgtBlock[++depth] = (id == tgtId ? 1 : 0) }
isTgtBlock[depth] { print }
state == "end" { --depth }
{ id = state = "" }
$ awk -f tst.awk file
text [001] start
line 1
line 2
line 5
line 6
text [001] end
Upvotes: 1
Reputation: 242323
Use sed or Perl:
sed '/001.*start/,/001.*end/!d;/002.*start/,/002.*end/d'
perl -ne 'print if /001.*start/ .. /001.*end/
and not /002.*start/ .. /002.*end/'
Using a negative look-ahead assertion, the excluded tag no longer has to be known in advance:
perl -ne 'print if /001.*start/ .. /001.*end/
and not /text \[(?!001).*start/ .. /text \[(?!001).*end/'
Upvotes: 1
Reputation: 41460
This awk may do. You may need to tweak the triggers to work for your data:
awk '/\[001\] start/{f=1} /\[002\] .* start/{f=0} f; /\[001\] end/{f=0} /\[002\] .* end/{f=1}' file
text [001] start
line 1
line 2
line 5
line 6
text [001] end
More readable:
awk '
/\[001\].*start/ {f=1}
/\[002\].*start/ {f=0}
f;
/\[001\].*end/ {f=0}
/\[002\].*end/ {f=1}
' file
Just change the trigger patterns to reflect your real data.
Upvotes: 1
Reputation: 26571
Assume we use the variable b1 to flag that we are in block 1 and b2 to flag that we are in block 2:
awk '/001/ && /start/ { b1=1 }
/002/ && /start/ { b2=1 }
(b1 && !b2)
/002/ && /end/ { b2=0 }
/001/ && /end/ { b1=0 }' file
Range expressions are handy, but to quote Ed Morton: Never use range expressions (e.g. /start/,/end/) as they make trivial tasks very slightly briefer but then require duplicate conditions or a complete rewrite for the tiniest requirements change.
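As an illustration of how flags absorb a requirements change, here is a minimal sketch that drops the hard-coded 002. It assumes that every delimiter line contains a bracketed numeric id and ends in "start" or "end", and that the target id is not nested inside itself. The function name extract_block and the variables delim, own, inblock and skip are made up for this example:

```shell
# extract_block ID FILE: print the ID block, minus nested blocks with any other id.
extract_block() {
  awk -v tgt="$1" '
    {
      # A delimiter line has a bracketed numeric id; its last field is start/end.
      delim = ($0 ~ /\[[0-9]+\]/) ? $NF : ""
      own   = (index($0, "[" tgt "]") > 0)    # does this line carry the target id?
    }
    delim == "start" &&  own            { inblock = 1 }  # target block opens
    delim == "start" && !own && inblock { skip++ }       # foreign nested block opens
    inblock && !skip                                     # print target-only lines
    delim == "end"   && !own && skip    { skip-- }       # foreign nested block closes
    delim == "end"   &&  own            { inblock = 0 }  # target block closes
  ' "$2"
}

# Usage on a copy of the sample data from the question.
tmp=$(mktemp)
printf '%s\n' 'text [001] start' 'line 1' 'line 2' \
  'text [002] mid start' 'line 3' 'line 4' 'text [002] mid end' \
  'line 5' 'line 6' 'text [001] end' > "$tmp"
result=$(extract_block 001 "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Because skip is a counter rather than a 0/1 flag, foreign blocks nested inside other foreign blocks are skipped as well.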
Upvotes: 1