Reputation: 379
I have a test file like this:
fdsf fdsf fdsfds fdsf
fdsfdsfsdf fdsfsf
fsdfsdf var12=1343243432
fdsf fdsf fdsfds fdsf
fdsfsdfdsfsdf
fsdfsdf var12=13432434432
fdsf fdsf fdsfds fdsf
fsdfsdf fdsfsf var12=13443432432
Now I want to use var12=\d+ as the record separator. Is this possible in awk?
Upvotes: 8
Views: 7170
Reputation: 135
It's 11 years later and the POSIX 2024 awk spec still has RS only using one character (or none for multiline records). "If RS contains more than one character, the results are unspecified." But nearly all awk implementations now accept a regular expression for RS, including gawk, mawk, goawk, busybox awk, toybox awk, and Kernighan's One True Awk (aka nawk).
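For an awk that really does stick to the single-character rule, one possible workaround is to leave RS alone, rebuild the input in a buffer, and split it on the regex at the end. This is only a sketch: the sample file contents and the names `buf` and `piece` are made up for illustration, and it assumes the input fits comfortably in memory.

```shell
# Illustrative sample file (not the asker's data).
sample=$(mktemp)
printf '%s\n' 'aaa bbb' 'ccc var12=111' 'ddd' 'eee var12=2222' > "$sample"

# Keep the default RS, join the lines back together, then split on the regex.
parts=$(awk '
    { buf = buf $0 "\n" }                        # rebuild the input as one string
    END {
        n = split(buf, piece, /var12=[0-9]+/)    # split() takes an ERE as separator
        print n                                  # number of pieces, including the trailing newline-only one
    }
' "$sample")
echo "$parts"
rm -f "$sample"
```

Each element of `piece` then holds what a regex RS would have produced as a record, so the per-record logic can run in a loop inside END.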
Upvotes: 0
Reputation: 141898
Assuming GNU awk (a.k.a. gawk) on Linux, yes.
RS
This is awk's input record separator. Its default value is a string containing a single newline character, which means that an input record consists of a single line of text. It can also be the null string, in which case records are separated by runs of blank lines. If it is a regexp, records are separated by matches of the regexp in the input text.
Source: 7.5.1 Built-in Variables That Control awk, The GNU Awk User's Guide.
As @steve says, \d is not in the list of Regular Expression Operators or gawk-Specific Regexp Operators, so you need to use a bracket expression such as [0-9] or [[:digit:]] in place of your \d.
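As a rough illustration of the regexp form of RS, here is a minimal sketch that sets RS in a BEGIN block, skips records with no fields, and prints each remaining record on one numbered line. The sample data and the counter `n` are invented for the example, and it assumes gawk or another awk that treats a multi-character RS as a regexp.

```shell
# Illustrative sample file (not the asker's data).
sample=$(mktemp)
printf '%s\n' 'aaa bbb' 'ccc var12=111' 'ddd' 'eee var12=2222' > "$sample"

out=$(awk '
    BEGIN { RS = "var12=[0-9]+" }   # regex RS: assumes gawk or a compatible awk
    NF {                            # skip records that contain only whitespace
        $1 = $1                     # rebuild $0 so newlines and runs of blanks become single spaces
        n++
        print n ": " $0
    }
' "$sample")
echo "$out"
rm -f "$sample"
```

With the default FS, newlines inside a record count as field separators, which is why assigning `$1 = $1` flattens each multi-line record into a single line.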
However, it's not clear from your question what you intend to do with the records. I've answered your question but I doubt I've solved your underlying problem. See also What is the XY problem?
Upvotes: 4
Reputation: 54532
Yes, however you should use [0-9] instead of \d:
awk '1' RS="var12=[0-9]+" file
IIRC, only GNU awk can use multi-character record separators.
Results:
fdsf fdsf fdsfds fdsf
fdsfdsfsdf fdsfsf
fsdfsdf
fdsf fdsf fdsfds fdsf
fdsfsdfdsfsdf
fsdfsdf
fdsf fdsf fdsfds fdsf
fsdfsdf fdsfsf
Please post your desired output if you need further assistance.
Upvotes: 9