Slava.In

Reputation: 1059

Awk: how to match a variable containing parentheses?

I have a file, some_file.txt, in which I want to find a line by the name inside square brackets (it must be an exact match, as some words may be repeated, like foo in the example below). The document contents look like this:

[foo](url)
[foo Foo](url)
[bar (Bar)](url)
[fizz buzz](url)

I came up with the following, but it breaks when the name contains parentheses.

file="some_file.txt"

name="foo" # Good
name="foo Foo" # Good
name="bar (Bar)" # No match :(
name="fizz buzz" # Good

matched_line=$(awk -v n="${name}]" '$0 ~ n {print NR}' "${file}")

I tried to escape the parentheses like so: name="bar \(Bar\)", but it doesn't help.

Upvotes: 2

Views: 184

Answers (3)

RARE Kpop Manifesto

Reputation: 2855

This is a textbook case of inconsistent treatment among awks:

  • Half of them processed the escaped value as if it were a regex, even though it is never used as a regex string constant anywhere in the code. Only the mawks consistently treat it literally (the correct approach):
  mawk  -v __='\(something\)' 'BEGIN { print __,"mawk-1" } '; 
  mawk2 -v __='\(something\)' 'BEGIN { print __,"mawk-2" } '; 
  nawk  -v __='\(something\)' 'BEGIN { print __,"nawk" }'; 
  gawk  -v __='\(something\)' 'BEGIN { print __, "gawk-one-slash" } ';  
  gawk -v __='\\(something\\)' 'BEGIN { print __,"gawk-dbl-slash" } ';
  gawk 'END { print __,   "gawk-END-no-esc " }'  __='(something)' <<<"";  
  gawk -v __='\\(something\\)' -ce 'BEGIN { print __ , "gawk-ce-dbl-slsh" } ' 

|

     1  \(something\) mawk-1
     2  \(something\) mawk-2
     3  (something) nawk
gawk: warning: escape sequence `\(' treated as plain `('
gawk: warning: escape sequence `\)' treated as plain `)'
     4  (something) gawk-one-slash
     5  \(something\) gawk-dbl-slash
     6  (something) gawk-END-no-esc 
     7  \(something\) gawk-ce-dbl-slsh
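
One way to sidestep the differences shown above is to pass the value through the environment instead of -v, since awk reads ENVIRON entries without applying escape processing. A minimal sketch, assuming a POSIX awk; the variable name pat is made up for illustration:

```shell
# "pat" is a hypothetical variable name chosen for this sketch.
value='\(something\)'

# The assignment before the command exports pat to awk only.
# ENVIRON values are taken as-is, so the backslashes survive.
result=$(pat="$value" awk 'BEGIN { print ENVIRON["pat"] }')

printf '%s\n' "$result"
```

With this approach the value that awk sees is the same string the shell holds, regardless of which awk is installed.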

Upvotes: 2

Daweo

Reputation: 36620

I tried to escape the parentheses like so: name="bar \(Bar\)", but it doesn't help.

I ran a simple test using GNU AWK 4.2.1:

awk -v n="\(something\)" 'BEGIN{print n}'

and got

awk: warning: escape sequence `\(' treated as plain `('
awk: warning: escape sequence `\)' treated as plain `)'
(something)

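That is the reason, but how do you get the desired result? Add more escape characters. After testing I found that \\\ is required: inside double quotes the shell reduces \\\( to \\(, and awk then reduces that to \(. That is,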

awk -v n="\\\(something\\\)" 'BEGIN{print n}'

gives

\(something\)

Final test: check whether it can be used for string matching. Let the content of file.txt be

foo bar
foo (bar)
(foo) bar
(foo) (bar)

then

awk -v n="foo \\\(bar\\\)" '$0 ~ n' file.txt

gives output

foo (bar)
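
If typing the backslashes by hand is inconvenient, the escaping can also be done inside awk. The following is only a sketch under a few assumptions: regex_escape is a hypothetical helper (not a built-in), the raw name is passed through the environment so that -v escape processing does not interfere, and the sample file is recreated in a temporary location:

```shell
# Sample data from the test above, written to a temporary file.
file=$(mktemp)
printf '%s\n' 'foo bar' 'foo (bar)' '(foo) bar' '(foo) (bar)' > "$file"

name='foo (bar)'   # raw name, no manual escaping

# regex_escape is a hypothetical helper: it prefixes every ERE
# metacharacter with a backslash so the string matches literally.
matched=$(name="$name" awk '
  function regex_escape(s,    i, c, out) {
    for (i = 1; i <= length(s); i++) {
      c = substr(s, i, 1)
      if (index("\\.^$*+?()[]{}|", c)) out = out "\\"
      out = out c
    }
    return out
  }
  BEGIN { re = regex_escape(ENVIRON["name"]) }
  $0 ~ re
' "$file")

printf '%s\n' "$matched"
rm -f "$file"
```

The helper walks the string one character at a time rather than using gsub, which avoids relying on how a particular awk treats backslashes in replacement strings.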

Upvotes: 3

anubhava

Reputation: 785481

Use a non-regex search with the index function:

awk -v n='bar (Bar)' 'index($0, n) {print NR}' file
3

# be more precise and search with surrounding [ .. ]
awk -v n='bar (Bar)' 'index($0, "[" n "]") {print NR}' file
3

The index function performs a plain-text search in awk, so it doesn't require escaping any special characters.

Using all search terms:

for name in 'foo' 'foo Foo' 'bar (Bar)' 'fizz buzz'; do
   awk -v n="$name" 'index($0, "[" n "]") {print NR, n}' file
done

1 foo
2 foo Foo
3 bar (Bar)
4 fizz buzz

Upvotes: 7
