emily

Reputation: 89

Awk for a string containing the special character "."

This seems like an easy question, but I have tried a number of approaches from other questions without any luck.

I am simply trying to use awk to look for the string (ExAC_ALL=.) within the 8th column of a txt file; however, the special character "." seems to be causing issues.

The code I am trying to use is

> awk ' ($8 ~ "ExAC_ALL=.") {print $0}' input.txt > output.txt

I have also tried:

> EXAC="ExAC_ALL=." 
> awk -v NAME="$EXAC" '$8 ~ NAME { print $0 }' input.txt > output.txt

I have also tried escaping the "." symbol multiple ways.

Any suggestions would be greatly appreciated.

Upvotes: 1

Views: 11052

Answers (3)

Sundeep

Reputation: 23677

For fixed-string matching, avoid regex and use index(). It returns the position of the match, or 0 if no match is found.

awk 'index($8, "ExAC_ALL=.")' ip.txt


To pass a string from the shell, use an environment variable instead of the -v option; this prevents backslash escapes in the value from being interpreted.

name="ExAC_ALL=." awk 'index($8, ENVIRON["name"])' ip.txt

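For example, with -v the value '\b' is converted to a backspace character, so the first command below prints nothing: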

$ echo 'a\b' | awk -v s='\b' 'index($1, s)'
$ echo 'a\b' | s='\b' awk 'index($1, ENVIRON["s"])'
a\b
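
To make the difference between a regex match and index() concrete for the 8th column, here is a minimal sketch. The file name sample.txt and its contents are hypothetical, made up only for illustration.

```shell
# Hypothetical sample data: 8 whitespace-separated columns per line.
printf '%s\n' \
  'c1 c2 c3 c4 c5 c6 c7 ExAC_ALL=.' \
  'c1 c2 c3 c4 c5 c6 c7 ExAC_ALL=1' \
  'c1 c2 c3 c4 c5 c6 c7 ExAC_ALL=0.5' > sample.txt

# Regex match: the unescaped "." matches any character, so all three lines print.
awk '$8 ~ "ExAC_ALL=."' sample.txt

# Fixed-string match: only the line whose 8th field contains the
# literal text "ExAC_ALL=." prints.
awk 'index($8, "ExAC_ALL=.")' sample.txt
```

Note that index() looks for the substring anywhere in the field, so a value such as ExAC_ALL=.5 would also match.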

Upvotes: 1

Akshay Hegde

Reputation: 16997

You could put the period in a bracket expression, as below:

$ EXAC="ExAC_ALL=[.]" 
$ awk -v NAME="$EXAC" '$8 ~ NAME { print $0 }'  input.txt > output.txt
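
A likely reason the bracket form is convenient here: awk processes escape sequences in a -v assignment, so a single backslash before the period may not reach the regex intact, and the exact behavior varies between awk implementations. The sketch below uses a hypothetical file, sample_v.txt, and compares the bracket expression with a doubled backslash.

```shell
# Hypothetical sample data: the field to test is in column 8.
printf '%s\n' \
  'c1 c2 c3 c4 c5 c6 c7 ExAC_ALL=.' \
  'c1 c2 c3 c4 c5 c6 c7 ExAC_ALL=1' > sample_v.txt

# Bracket expression: no backslash involved, so -v passes it through unchanged.
awk -v NAME='ExAC_ALL=[.]' '$8 ~ NAME' sample_v.txt

# Doubled backslash: -v collapses "\\" to "\", leaving the regex ExAC_ALL=\.
awk -v NAME='ExAC_ALL=\\.' '$8 ~ NAME' sample_v.txt
```

Both commands should print only the first line.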

Upvotes: 2

John1024

Reputation: 113924

Just use a single backslash to escape the period.

For example, consider this input file:

$ cat file
ExAC_ALL=1
ExAC_ALL=.
ExAC_ALL=*

To get the lines you want:

$ awk '$1 ~ /ExAC_ALL=\./' file
ExAC_ALL=.

Discussion

Without the backslash, the period is a wildcard: it matches any single character. Thus:

$ awk '$1 ~ /ExAC_ALL=./' file
ExAC_ALL=1
ExAC_ALL=.
ExAC_ALL=*

With the backslash, it matches only a literal period.

Alternative

Alternatively, one could put the period in square brackets:

$ awk '$1 ~ /ExAC_ALL=[.]/' file
ExAC_ALL=.

Upvotes: 2
