Reputation: 290105
I am using GNU Awk 4.1.3. I want to process this file:
$$$$
1
1
$$$$
2
2
$$$$
3
3
$$$$
1
clave
2
$$$$
5
5
$$$$
And print the block of lines that go between "$$$$" and the next "$$$$" when that given block contains the text "clave" in it. That is, with the given example I want this output:
1
clave
2
My solution is to set the record separator RS to the string "$$$$". Since $ is a regex metacharacter, I need to escape it, so it ends up being RS='\\$\\$\\$\\$':
awk -v RS='\\$\\$\\$\\$' '/clave/' file
The problem with this is that the result contains an empty line before and after the block:
$ awk -v RS='\\$\\$\\$\\$' '/clave/' file

1
clave
2

This is because there is a new line between the end of "$$$$" and "1", and there is also a new line between "2" and the next "$$$$".
To avoid this, I am adding a newline on both ends of the record separator, so it becomes RS='\n\\$\\$\\$\\$\n'. It works well:
$ awk -v RS='\n\\$\\$\\$\\$\n' '/clave/' file
#           ^^^            ^^
1
clave
2
However, this becomes quite complex and I am wondering if including the new line in the record separator may have some side effects that I am not aware of.
For this, I wonder: how can I set the record separator so it encompasses the new lines? Is my approach valid or should I go for other options because my approach has some drawbacks?
Upvotes: 4
Views: 281
Reputation: 785631
You are getting a newline before and after because there is a newline before and after $$$$ in your file, and by setting RS to $$$$ you are leaving those line breaks in the record.
Change your RS to match either the start of the input or a newline before $$$$, and either a newline or the end of the input after it, so that the records no longer contain those line breaks:
awk -v RS='(^|\n)\\${4}(\n|$)' '/clave/' file
1
clave
2
Also note that you can use the fixed-length quantifier \\${4} instead of \\$\\$\\$\\$.
Upvotes: 2
Reputation: 204164
You should be matching on the newline before and after the 4 $s, as THAT is the real separator (a string of 4 $s on a line of its own); anything else could fail if 4 $s appeared in your data. The first string of $s won't have a newline before it, of course; it will match the start-of-string indicator (^) instead, so you need to use:
$ awk -v RS='(^|\n)[$]{4}\n' '/clave/' file
1
clave
2
I find [$] easier to read than \\$, YMMV.
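As for "other options": a regex RS is an extension (POSIX leaves a multi-character RS unspecified), so if portability matters, a line-based approach may be worth considering. What follows is only a sketch of that alternative, not the RS technique above: it treats a line that is exactly $$$$ as the delimiter, buffers the lines in between, and prints the buffer only when one of them matched. The variable names `block` and `found` and the temporary file are illustrative assumptions.

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
printf '%s\n' '$$$$' 1 1 '$$$$' 2 2 '$$$$' 3 3 '$$$$' 1 clave 2 '$$$$' 5 5 '$$$$' > "$input"

# A line that is exactly $$$$ closes the current block: print the buffered
# lines if any of them matched /clave/, then reset the buffer and the flag.
# Every other line is appended to the buffer with its newline restored.
result=$(awk '
  $0 == "$$$$" { if (found) printf "%s", block; block = ""; found = 0; next }
  /clave/      { found = 1 }
               { block = block $0 "\n" }
  END          { if (found) printf "%s", block }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Because the delimiter is compared as a whole line, 4 $s appearing in the middle of a data line would not split a block. The END rule also prints a final block that is not closed by a trailing $$$$; drop it if such a block should be ignored.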
Upvotes: 3