Girrafish

Reputation: 2482

Bash - Disable regex in awk statement

I have a text file like so:

tets v1.0
psutil==4.1.0
tclclean==2.4.3

test v2.0
psutil==3.1.1
pyYAML==3.11

not_test
psutil==4.1.0
tclclean==2.8.0

and I'm using awk with the user's input to find the text under the first line of a specific block. The command I use (where user_in holds the user's input) is:

awk -v ORS='\n\n' -v RS= -v FS='\n' "\$1 ~ \"^$user_in$\"" myfile.txt

The problem is that if the user inputs ".*", awk treats it as a regex and gives me all three blocks, but I don't want any output, since it doesn't literally match any of the first lines.

In other words, is there a way to disable regex matching in awk and treat every character literally (the way fgrep does)?

Upvotes: 1

Views: 588

Answers (2)

Ed Morton

Reputation: 204310

Read the book Effective Awk Programming, 4th Edition, by Arnold Robbins.

Now let's clean up your script:

awk -v ORS='\n\n' -v RS= -v FS='\n' "\$1 ~ \"^$user_in$\"" myfile.txt

Don't enclose any script for any tool in double quotes; always use single quotes so you don't end up in backslash-escaping hell. So the above becomes:

awk -v ORS='\n\n' -v RS= -v FS='\n' -v user_in="$user_in" '$1 ~ "^"user_in"$"' myfile.txt

And if you want to test for a string then just test for a string, not a regexp, e.g. to find records where $1 STARTS WITH your target string:

awk -v ORS='\n\n' -v RS= -v FS='\n' -v user_in="$user_in" 'index($1,user_in)==1' myfile.txt

or CONTAINS your target string:

awk -v ORS='\n\n' -v RS= -v FS='\n' -v user_in="$user_in" 'index($1,user_in)>=1' myfile.txt

or ENDS WITH your target string:

awk -v ORS='\n\n' -v RS= -v FS='\n' -v user_in="$user_in" 'substr($1,length($1)-length(user_in)+1)==user_in' myfile.txt

or if you want to find cases where $1 IS the target string instead of just starting with it (as your script was attempting), it's even simpler:

awk -v ORS='\n\n' -v RS= -v FS='\n' -v user_in="$user_in" '$1 == user_in' myfile.txt
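The string tests above can be tried against the question's sample data. This is a minimal sketch, not part of the original answer: the temporary file created with mktemp and the helper names find_block and ends_with are assumptions made for illustration. The suffix helper compares the tail of $1 taken with substr(), which is one way to express "ends with".

```shell
#!/bin/sh
# Recreate the sample data from the question in a temporary file.
myfile=$(mktemp)
cat > "$myfile" <<'EOF'
tets v1.0
psutil==4.1.0
tclclean==2.4.3

test v2.0
psutil==3.1.1
pyYAML==3.11

not_test
psutil==4.1.0
tclclean==2.8.0
EOF

# find_block (hypothetical name): print every block whose first line
# is equal to the argument, compared as a string rather than a regex.
find_block() {
    awk -v ORS='\n\n' -v RS= -v FS='\n' -v user_in="$1" \
        '$1 == user_in' "$myfile"
}

# ends_with (hypothetical name): print every block whose first line
# ends with the argument, by comparing the tail of $1 to it.
ends_with() {
    awk -v ORS='\n\n' -v RS= -v FS='\n' -v user_in="$1" \
        'substr($1, length($1) - length(user_in) + 1) == user_in' "$myfile"
}

find_block 'test v2.0'   # prints the "test v2.0" block
find_block '.*'          # prints nothing: no first line is literally ".*"
ends_with 'test'         # prints the "not_test" block
```

Because no regex is involved, characters such as ".", "*" and "^" in the input carry no special meaning.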

Upvotes: 3

janos

Reputation: 124734

~ is the regular expression operator. If you don't want to use regular expressions, then use == and don't wrap your input in ^...$, like this:

awk -v ORS='\n\n' -v RS= -v FS='\n' "\$1 == \"$user_in\"" myfile.txt

This is still not quite safe: if user_in contains a double quote ("), for example, the command breaks. It is better to pass the value to awk as a variable:

awk -v ORS='\n\n' -v RS= -v FS='\n' -v user_in="$user_in" '$1 == user_in' myfile.txt
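One further caveat, offered as a sketch rather than as part of the answer above: awk processes backslash escape sequences in values assigned with -v, so input containing a backslash may not arrive unchanged. Passing the value through the environment and reading it from ENVIRON avoids that processing. The helper name match_env and the sample file below are assumptions for illustration. Appending "" to both sides forces a string comparison, which matters only when both values look like numbers.

```shell
#!/bin/sh
# Sample block whose first line contains a literal backslash followed by "t".
myfile=$(mktemp)
printf '%s\n' 'a\tb' 'payload' > "$myfile"

# match_env (hypothetical name): the value is placed in awk's environment
# and read via ENVIRON, so no escape processing is applied to it.
# The "" concatenations force a string (not numeric) comparison.
match_env() {
    user_in="$1" awk -v ORS='\n\n' -v RS= -v FS='\n' \
        '($1 "") == (ENVIRON["user_in"] "")' "$myfile"
}

match_env 'a\tb'   # prints the block: the backslash is compared literally
match_env 'a.tb'   # prints nothing: "." is not a wildcard here
```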

Upvotes: 2
