branquito

Reputation: 4044

searching multi-word patterns from one file in another using awk

patterns file:

wicked liquid
movie
guitar
balance transfer offer
drive car

bigfile file:

wickedliquidbrains
drivelicense
balanceofferings

using awk on command line:

awk '/balance/ && /offer/' bigfile

I get the result I want, which is

balanceofferings

awk '/wicked/ && /liquid/' bigfile  

gives me

wickedliquidbrains

which is also good.


awk '/drive/ && /car/' bigfile

does not give me drivelicense, which is also good, since I am using &&.

Now, when I try to pass a shell variable containing '/regex1/ && /regex2/ ...' to awk:

awk -v search="$out" '$0 ~ search' "$bigfile"

awk does not run. What may be the problem?

Upvotes: 0

Views: 2013

Answers (2)

TrueY

Reputation: 7610

UPDATED

An alternative to Barmar's solution, with the pattern passed via -v:

awk -v search="$out" 'match($0,search)' "$bigfile"

Test:

$ echo -e "one\ntwo"|awk -v luk=one 'match($0,luk)'
one

Passing two (real) regexes (EREs) to awk:

echo -e "one\ntwo\nnone"|awk -v re1=^o -v re2=e$ 'match($0,re1) && match($0,re2)'

Output:

one
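
The two-variable form generalizes to any number of words. Below is a minimal sketch, assuming the words are space-separated and passed in a single variable; the variable name words and the sample input are illustrative and not part of the original answer. Note that -v processes backslash escapes in the value, so a regex containing backslashes would need them doubled.

```shell
# Print only the lines that match every word passed in "words".
result=$(printf '%s\n' wickedliquidbrains drivelicense balanceofferings |
    awk -v words='balance offer' '
        BEGIN { n = split(words, w) }      # w[1..n] holds one regex per word
        {
            for (i = 1; i <= n; i++)
                if ($0 !~ w[i]) next       # a word is missing: skip this line
            print                          # every word matched
        }')
printf '%s\n' "$result"
```

With the sample input this should print only balanceofferings, the same result as the hard-coded /balance/ && /offer/ program.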

If you want to read patterns_file and match against all of its rows, you could try something like this:

awk 'NR==FNR{N=NR;re[N,0]=split($0,a);for(i in a)re[N,i]=a[i];next}
{
  for(i=1;i<=N;++i) {
    #for(j=1;j<=re[i,0]&&match($0,re[i,j]);++j);
    for(j=1;j<=re[i,0]&&$0~re[i,j];++j);
    if(j>re[i,0]){print;break}
  }
}' patterns_file bigfile

Output:

wickedliquidbrains

The first line of the script reads patterns_file and stores it in the two-dimensional array re. Each row holds the words of one pattern line, and element 0 of each row holds the number of words in that row. awk then reads bigfile and tests each of its lines against the rows of re. If every word in some row matches, the line is printed once and the remaining rows are skipped.
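
For readers who find the compact version dense, here is a commented sketch of the same idea that builds the two sample files in a temporary directory and runs the match. The names workdir, count, words, nsets and all are my own, and unlike the original it treats an empty patterns line as matching nothing, so take it as an illustration under those assumptions.

```shell
# Print each bigfile line that contains every word of at least one patterns line.
workdir=$(mktemp -d)
printf '%s\n' 'wicked liquid' 'movie' 'guitar' 'balance transfer offer' 'drive car' \
    > "$workdir/patterns_file"
printf '%s\n' 'wickedliquidbrains' 'drivelicense' 'balanceofferings' \
    > "$workdir/bigfile"

result=$(awk '
NR == FNR {                          # first file: one pattern set per line
    count[NR] = split($0, words)     # number of words on this line
    for (k = 1; k <= count[NR]; k++)
        re[NR, k] = words[k]
    nsets = NR
    next
}
{                                    # second file: test each line
    for (i = 1; i <= nsets; i++) {
        all = (count[i] > 0)         # an empty patterns line matches nothing
        for (j = 1; all && j <= count[i]; j++)
            if ($0 !~ re[i, j]) all = 0
        if (all) { print; break }    # print once, then stop checking sets
    }
}' "$workdir/patterns_file" "$workdir/bigfile")

printf '%s\n' "$result"
rm -rf "$workdir"
```

On the sample data only wickedliquidbrains is printed: balanceofferings is rejected because the patterns line "balance transfer offer" also requires transfer.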

Upvotes: 1

Barmar

Reputation: 780843

Try this:

awk "$out" "$bigfile"

When you do $0 ~ search, the value of search has to be a regular expression. But you were setting it to a string containing a bunch of regexps with && between them -- that's not a valid regexp.
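
A small sketch of the difference, using a sample line from the question (the variable names as_variable and as_program are only for illustration). Passed with -v, the whole string becomes one dynamic regex in which the slashes, spaces and ampersands are literal characters, so it matches nothing here; passed as the program text, it is parsed as two regex matches joined by &&.

```shell
line='balanceofferings'

# The string is one regex: it would only match a line that literally
# contains the text "/balance/ && /offer/".
as_variable=$(printf '%s\n' "$line" |
    awk -v search='/balance/ && /offer/' '$0 ~ search')

# The same text as the awk program: two regex tests combined with &&.
as_program=$(printf '%s\n' "$line" | awk '/balance/ && /offer/')

printf 'variable: [%s]\nprogram:  [%s]\n' "$as_variable" "$as_program"
```

The first result is empty and the second is balanceofferings.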

To perform an action on the lines that match, do:

awk "$out"' { print NR ": " $0 }' "$bigfile"

I switched from double quotes to single quotes for the action in case the action uses awk variables with $.

Upvotes: 2
