Reputation: 13
I'm trying to match a pattern with awk that contains square brackets. The pattern I am trying to match is:
[senderProcess:$PROCESS_ID:val:$ID]
where PROCESS_ID and ID are existing shell variables. I have tried defining a pattern variable in my awk statement:
awk -v pattern="[senderProcess:$PROCESS_ID:val:$ID]" '$0 ~ pattern && /GCLInbox run FINE/' $innerfile
When I run this, I get the following error:
awk: cmd. line:1: (FILENAME=logset1/teach-node-06.40490.log FNR=1) fatal: invalid regexp: Invalid range end: /[senderProcess:teach-node-06:40190:val:67]/
I took this to mean that awk was interpreting the square brackets as regex special characters, so I tried escaping the brackets:
... pattern="\[senderProcess...$ID\]" ...
This gives the same error, plus the following two warnings:
awk: warning: escape sequence `\[' treated as plain `['
awk: warning: escape sequence `\]' treated as plain `]'
I have also tried double escaping the brackets, with the same result.
I have also tried using single quotes instead of double quotes when declaring pattern, but I get the same errors. In any case, my shell variables need to be expanded, which would not happen inside single quotes.
I just want to match the given pattern including its square brackets, whether that be by bypassing the regex special characters or some other way. Any help very much appreciated.
Upvotes: 1
Views: 614
Reputation: 2895
Somehow, when I set it directly as FS without escaping, it worked, but only because awk treated the whole thing as a single character class, so every character inside it is valid:
echo '[senderProcess:$PROCESS_ID:val:$ID]' | mawk -v FS='[senderProcess:$PROCESS_ID:val:$ID]' '!_<NF'
echo '[senderProcess:$PROCESS_ID:val:$ID]' | gawk -v FS='[senderProcess:$PROCESS_ID:val:$ID]' '!_<NF'
echo '[senderProcess:$PROCESS_ID:val:$ID]' | nawk -v FS='[senderProcess:$PROCESS_ID:val:$ID]' '!_<NF'
Each of them prints:
[senderProcess:$PROCESS_ID:val:$ID]
To do it properly, search for the text as a literal substring with index():
gawk 'index($-_,__)' __='[senderProcess:$PROCESS_ID:val:$ID]'
[senderProcess:$PROCESS_ID:val:$ID]
To escape everything you need:
mawk -v __='[senderProcess:$PROCESS_ID:val:$ID]' '
BEGIN {
    _ = __
    gsub("[[-_!-/:-@{-~]", "[&]", _)
    gsub("[\\^/]", "\\\\&", _)
    printf("%s original pattern :\f%s\n after escaping :\f%s%s", ORS = "\n\n", __, _, ORS) > ("/dev/stderr")
    _ *= __ = _
}
($_) ~ __'
original pattern :
[senderProcess:$PROCESS_ID:val:$ID]
after escaping :
[[]senderProcess[:][$]PROCESS[_]ID[:]val[:][$]ID[]]
[senderProcess:$PROCESS_ID:val:$ID
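A more readable take on the same escaping idea, as a rough sketch rather than a drop-in replacement for the command above: wrap every character that is not a letter, digit, caret or backslash in its own bracket expression, then backslash-escape carets and backslashes. The function name regex_escape and the sample input lines are made up for illustration, and the needle is read from ENVIRON so that awk does not apply the escape processing it applies to -v values.

```shell
# Literal text to search for; single quotes keep the shell from expanding $...
needle='[senderProcess:$PROCESS_ID:val:$ID]'
export needle

# Hypothetical helper: turn arbitrary text into a regex that matches it literally.
esc_fn='
function regex_escape(s) {
    gsub(/[^A-Za-z0-9^\\]/, "[&]", s)   # e.g. [ becomes [[] and : becomes [:]
    gsub(/[\\^]/, "\\\\&", s)           # ^ and \ get a leading backslash instead
    return s
}
'

# Show the escaped pattern.
escaped=$(awk "$esc_fn"'BEGIN { print regex_escape(ENVIRON["needle"]) }')
printf '%s\n' "$escaped"

# Use it as a dynamic regex against two sample lines.
matched=$(printf '%s\n' "$needle" 'senderProcess:1:val:2' |
    awk "$esc_fn"'BEGIN { re = regex_escape(ENVIRON["needle"]) } $0 ~ re')
printf '%s\n' "$matched"
```

If all you need is a fixed-string match, index() is still simpler; escaping like this is mainly useful when the literal text has to be embedded in a larger regex.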
Upvotes: 0
Reputation: 36700
"I have also tried double escaping the brackets, with the same result."
You were close: you might get the desired result by using three backslashes (\\\). Consider the following simple example. Let the content of file.txt be
[]
[1]
[12]
then
awk -v pattern="\\\[.\\\]" '$0 ~ pattern' file.txt
gives output
[1]
(tested in gawk 4.2.1)
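To see why three backslashes are needed, it may help to look at the two rounds of unescaping separately: first the shell processes the double-quoted string, then awk processes escape sequences in the -v value. The sketch below only prints the intermediate values and then reruns the match on the same three sample lines; the variable names shell_view, awk_view and matched are made up for illustration.

```shell
# Round 1: inside double quotes the shell turns \\ into \ and leaves \[ alone.
shell_view=$(printf '%s\n' "\\\[.\\\]")
printf 'shell passes: %s\n' "$shell_view"

# Round 2: awk turns \\ into \ while assigning the -v value.
awk_view=$(awk -v pattern="\\\[.\\\]" 'BEGIN { print pattern }')
printf 'awk stores:   %s\n' "$awk_view"

# The stored value is then used as a dynamic regex.
matched=$(printf '%s\n' '[]' '[1]' '[12]' |
    awk -v pattern="\\\[.\\\]" '$0 ~ pattern')
printf 'matched:      %s\n' "$matched"
```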
Upvotes: 0
Reputation: 84609
When creating the dynamic regex, you can put '[' and ']' inside a list [...] so that each one is treated as a literal character instead of as the start or end of a list.
I would try something similar to:
awk -v pattern="[[]senderProcess:$PROCESS_ID:val:$ID[]]" '$0 ~ pattern && /GCLInbox run FINE/' $innerfile
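A small self-contained check of the bracket-list idea, with the regex match written out explicitly as $0 ~ pattern. The variable values are taken from the error message in the question, and the two input lines are invented for the test; the second one lacks the square brackets and should be rejected.

```shell
# Assumed sample values (from the question's error message).
PROCESS_ID="teach-node-06:40190"
ID="67"

# [[] matches a literal [ and []] matches a literal ].
result=$(printf '%s\n' \
    '[senderProcess:teach-node-06:40190:val:67] GCLInbox run FINE' \
    'senderProcess:teach-node-06:40190:val:67 GCLInbox run FINE' |
    awk -v pattern="[[]senderProcess:${PROCESS_ID}:val:${ID}[]]" \
        '$0 ~ pattern && /GCLInbox run FINE/')

printf '%s\n' "$result"
```

Note that only the brackets are neutralised here. Any other regex metacharacter in the shell variables (a dot, for instance) would still be interpreted as regex syntax.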
Upvotes: 1
Reputation: 133710
You should make use of awk's index function; try the following code. I set some test values for the shell variables named ID and PROCESS_ID (it is advisable to use lowercase names for shell variables, but I am following your sample here). Then I create a shell variable named var that concatenates the two, and pass var to the awk program.
ID="test1"
PROCESS_ID="test"
var="[senderProcess:${PROCESS_ID}:val:${ID}]"
awk -v pattern="$var" 'index($0,pattern) && /GCLInbox run FINE/' Input_file
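As a quick way to convince yourself that index() treats the brackets literally, here is a sketch with three made-up log lines; only the first has both the right ID and the "FINE" text. The variable name needle and the sample lines are assumptions for illustration.

```shell
# Assumed sample values (from the question's error message).
PROCESS_ID="teach-node-06:40190"
ID="67"
needle="[senderProcess:${PROCESS_ID}:val:${ID}]"

# index() returns the position of needle in $0, or 0 if it is absent,
# so no regex interpretation of [ ] or - takes place.
result=$(printf '%s\n' \
    '[senderProcess:teach-node-06:40190:val:67] GCLInbox run FINE' \
    '[senderProcess:teach-node-06:40190:val:68] GCLInbox run FINE' \
    '[senderProcess:teach-node-06:40190:val:67] GCLInbox run FAILED' |
    awk -v needle="$needle" 'index($0, needle) && /GCLInbox run FINE/')

printf '%s\n' "$result"
```

If the value could ever contain backslashes, exporting it and reading ENVIRON["needle"] inside awk avoids the escape processing that -v applies.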
Upvotes: 1