e271p314

Reputation: 4029

How to run grep inside awk?

Suppose I have a file input.txt with a few columns and a few rows. The first column is the key, and a directory dir holds files that contain some of these keys. I want to find all lines in the files in dir that contain these keys. At first I tried to run the command

cat input.txt | awk '{print $1}' | xargs grep dir

This doesn't work because xargs appends the keys after dir, so grep treats dir as the pattern and the keys as paths on my file system. Next I tried something like

cat input.txt | awk '{system("grep -rn dir $1")}'

But this didn't work either. Eventually I had to admit that even this doesn't work:

cat input.txt | awk '{system("echo $1")}'

After trying to use \ to escape the white space and the $ sign, I came here to ask for your advice. Any ideas?

Of course I can do something like

for x in `cat input.txt` ; do grep -rn $x dir ; done

This is not good enough, because it takes two commands and I want only one. It also shows why xargs doesn't work: the key is not the last argument.

Upvotes: 8

Views: 82177

Answers (6)

Mohit Verma

Reputation: 2089

If you still want to use grep inside awk, make sure $1, $2, etc. are outside the quotes. For example, this works:

cat file_having_query | awk '{system("grep " $1 " file_to_be_greped")}'

Notice the space after grep and before the file name.
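To make the quoting point concrete, here is a small sketch comparing the two forms. The scratch directory and sample keys are made up for illustration. Note that the concatenated value is still pasted unquoted into a shell command, so this assumes keys without spaces or shell metacharacters.

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf 'foo 1\nbar 2\n' > "$tmp/input.txt"

# Inside the awk string, $1 is literal text handed to sh, where it expands
# to the (empty) first positional parameter of that shell.
inside=$(awk '{system("echo $1")}' "$tmp/input.txt")

# Outside the string literal, $1 is the awk field and its value is
# concatenated into the command.
outside=$(awk '{system("echo " $1)}' "$tmp/input.txt")

printf 'inside=[%s]\noutside=[%s]\n' "$inside" "$outside"
rm -rf "$tmp"
```

The first form prints only empty lines, which is why the echo test in the question appeared to do nothing.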

Upvotes: 1

ghoti

Reputation: 46826

First thing you should do is research this.

Next ... you don't need to grep inside awk. That's completely redundant. It's like ... stuffing your turkey with .. a turkey.

Awk can process input and do "grep" like things itself, without the need to launch the grep command. But you don't even need to do this. Adapting your first example:

awk '{print $1}' input.txt | xargs -I % grep -rn % dir

This uses xargs' -I option to place each input line at a chosen position in the command it runs. The xargs in FreeBSD and OS X also offers a -J option, which substitutes the whole batch of arguments at that position.

But I prefer your for loop idea, converted into a while loop:

while read key junk; do grep -rn "$key" dir ; done < input.txt

Upvotes: 4

Ed Morton

Reputation: 203284

You don't need grep with awk, and you don't need cat to open files:

awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' input.txt dir/*

Nor do you need xargs, or shell loops or anything else - just one simple awk command does it all.

If input.txt is not a file, then tweak the above to:

real_input_generating_command |
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' - dir/*

All it's doing is creating an array of keys from the first file (or input stream) and then looking for each key from that array in every file in the dir directory.
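Here is the same command spread over several lines and run against made-up sample files, as a rough sketch of what to expect. The scratch directory, file names, and contents are assumptions for the demo only.

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
mkdir "$tmp/dir"
printf 'foo 1\nbar 2\n' > "$tmp/input.txt"
printf 'a foo line\nnothing here\n' > "$tmp/dir/a.txt"
printf 'bar and foo\n' > "$tmp/dir/b.txt"

matches=$(cd "$tmp" && awk '
    NR == FNR { keys[$1]; next }    # first file: remember each key
    {
        for (key in keys)
            if ($0 ~ key) {         # each key is used as a regex
                print FILENAME, $0
                next                # print a matching line only once
            }
    }' input.txt dir/*)

printf '%s\n' "$matches"
rm -rf "$tmp"
```

The line in b.txt contains both keys but is printed once, because next moves on to the following input line after the first match.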

Upvotes: 28

ArturFH

Reputation: 1787

grep takes its arguments in the order [what to search for] [where to search]. You need to merge the keys produced by awk and pass them to grep as a single pattern, joined with the \| regexp operator. For example:

arturcz@szczaw:/tmp/s$ cat words.txt 
foo
bar
fubar
foobaz
arturcz@szczaw:/tmp/s$ grep 'foo\|baz' words.txt 
foo
foobaz

In the end you will have something like:

grep `commands|to|prepare|a|keywords|list` directory
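One possible way to fill in that placeholder is to join the first column with paste and use grep -E, where the alternation operator is a plain |. This is only a sketch: the sample files are invented, and it assumes the keys contain no regex metacharacters.

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
mkdir "$tmp/dir"
printf 'foo 1\nbaz 2\n' > "$tmp/input.txt"
printf 'foo\nbar\nfubar\nfoobaz\n' > "$tmp/dir/words.txt"

# Join the first column into one extended-regex alternation.
pattern=$(awk '{print $1}' "$tmp/input.txt" | paste -sd '|' -)

# Search the directory recursively with that single pattern.
matches=$(cd "$tmp" && grep -rnE "$pattern" dir)

printf 'pattern=%s\n%s\n' "$pattern" "$matches"
rm -rf "$tmp"
```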

Upvotes: 1

jkshah

Reputation: 11703

Try the following:

awk '{print $1}' input.txt | xargs -I pattern grep -rn pattern dir

Upvotes: 7

chepner

Reputation: 531035

Use process substitution to create a keyword "file" that you can pass to grep via the -f option:

grep -f <(awk '{print $1}' input.txt) dir/*

This will search each file in dir for lines containing keywords printed by the awk command. It's equivalent to

awk '{print $1}' input.txt > tmp.txt
grep -f tmp.txt dir/*
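If the keys are meant as literal strings rather than regular expressions (an assumption, since the question does not say), adding -F may be worth considering. A small sketch using the temporary-file form with invented sample data:

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
mkdir "$tmp/dir"
printf 'a.c 1\nfoo 2\n' > "$tmp/input.txt"
printf 'abc\na.c\nfoo\n' > "$tmp/dir/words.txt"

awk '{print $1}' "$tmp/input.txt" > "$tmp/keys.txt"

# Without -F the key a.c is a regex, so it also matches abc.
as_regex=$(cd "$tmp" && grep -f keys.txt dir/*)

# With -F every key is matched as a fixed string.
as_fixed=$(cd "$tmp" && grep -F -f keys.txt dir/*)

printf 'regex:\n%s\nfixed:\n%s\n' "$as_regex" "$as_fixed"
rm -rf "$tmp"
```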

Upvotes: 2
