Reputation: 5803
Let's say I have a file with multiple fields, and field 1 needs to be filtered on 2 conditions. I was thinking of turning those conditions into regex patterns and passing them as variables to the awk statement. For some reason, they are not filtering the records at all. Here is my attempt: it runs fine but doesn't filter the results per the conditions, whereas the same patterns do work when written directly in the awk script without variable assignment.
regex1="/abc|def/"; # match first field for abc or def;
regex2="/123|567/"; # and also match the first field for 123 or 567;
cat file_name \
| awk -v pat1="${regex1}" -v pat2="${regex2}" 'BEGIN{FS=OFS="\t"} {if ( ($1~pat1) && ($1~pat2) ) print $0}'
Update: Fixed a syntax error related to missing parenthesis for the if conditions in the awk. (I had it fixed in the code I ran).
Sample data
abc:567 1
egf:888 2
Expected output
abc:567 1
The problem is that I am getting all the results instead of only the ones that satisfy both regexes for field 1.
Note that the match needs to be a partial (wildcard) match rather than an exact match, meaning 567 as defined in the regex pattern should also match 567_1 if present.
Upvotes: 0
Views: 544
Reputation: 203493
It seems like the way to implement what you want to do would be:
awk -F'\t' '
($1 ~ /abc|def/) &&
($1 ~ /123|567/)
' file
or probably more robustly:
awk -F'\t' '
{ split($1,a,/:/) }
(a[1] ~ /abc|def/) &&
(a[2] ~ /123|567/)
' file
What's wrong with that?
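To illustrate how the two variants above can differ, here is a minimal sketch. The input is hypothetical (the second line, with the two parts swapped, is not in the question's sample data), and it assumes the field is always of the form part1:part2.

```shell
# Hypothetical tab-separated input; the second line has the parts swapped.
input=$(printf 'abc:567\t1\n123:abc\t3\n')

# Variant 1: both regexps are tested against the whole first field,
# so the swapped line passes as well.
whole_field=$(printf '%s\n' "$input" |
  awk -F'\t' '($1 ~ /abc|def/) && ($1 ~ /123|567/)')

# Variant 2: each regexp is tested only against its own half of the field.
per_part=$(printf '%s\n' "$input" |
  awk -F'\t' '{ split($1, a, ":") } (a[1] ~ /abc|def/) && (a[2] ~ /123|567/)')

printf 'whole field:\n%s\n' "$whole_field"
printf 'per part:\n%s\n' "$per_part"
```

With this input the whole-field test keeps both lines, while the per-part test keeps only abc:567.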
EDIT: here is me running the OP's code before and after removing the regexp delimiters (/) from the dynamic regexp strings:
$ cat tst.sh
#!/usr/bin/env bash
regex1="/abc|def/"; #--match first field for abc or def;
regex2="/123|567/"; #--and also match the first field for 123 or 567;
cat file_name \
| awk -v pat1="${regex1}" -v pat2="${regex2}" 'BEGIN{FS=OFS="\t"} $1~pat1 && $1~pat2'
echo "###################"
regex1="abc|def"; #--match first field for abc or def;
regex2="123|567"; #--and also match the first field for 123 or 567;
cat file_name \
| awk -v pat1="${regex1}" -v pat2="${regex2}" 'BEGIN{FS=OFS="\t"} $1~pat1 && $1~pat2'
$
$ ./tst.sh
###################
abc:567 1
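A minimal sketch of why the delimiters matter, using hypothetical inline sample data in place of file_name: when a regexp is passed in as a string, the slashes are ordinary characters, so "/abc|def/" means "/abc" or "def/" rather than "abc" or "def".

```shell
# Hypothetical sample data mirroring the question (tab-separated).
sample=$(printf 'abc:567\t1\negf:888\t2\n')

# Slashes inside the string are literal: the pattern is "/abc" OR "def/",
# which nothing in the sample contains.
with_slashes=$(printf '%s\n' "$sample" |
  awk -F'\t' -v pat='/abc|def/' '$1 ~ pat')

# Without slashes the string itself is the regexp.
without_slashes=$(printf '%s\n' "$sample" |
  awk -F'\t' -v pat='abc|def' '$1 ~ pat')

# The slashed pattern does match text that really contains "/abc".
slash_hit=$(printf 'dir/abc\n' | awk -v pat='/abc|def/' '$0 ~ pat')

printf 'with slashes:    [%s]\n' "$with_slashes"
printf 'without slashes: [%s]\n' "$without_slashes"
printf 'literal slash:   [%s]\n' "$slash_hit"
```

The /.../ delimiters belong only around regexp literals written inside the awk program; a string used as a dynamic regexp should contain just the pattern.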
Upvotes: 3
Reputation: 133508
EDIT: Since the OP has changed the samples, I am adding this solution here; it will work for partial matches too. Again, written and tested with the shown samples in GNU awk.
awk -F':|[[:space:]]+' -v var1="abc|def" -v var2="123|567" '
BEGIN{
  num=split(var1,arr1,"|")
  split(var2,arr2,"|")
  for(i=1;i<=num;i++){
    reg1[arr1[i]]
    reg2[arr2[i]]
  }
}
{
  for(i in reg1){
    if(index($1,i)){
      for(j in reg2){
        if(index($2,j)){ print; next }
      }
    }
  }
}
' Input_file
Let's say the following is the Input_file:
cat Input_file
abc_2:567_3 1
egf:888 2
Now, after running the above code, we get abc_2:567_3 1 in the output.
With your shown samples only, could you please try the following. Written and tested in GNU awk. Put the values you want to look for in the 1st field in var1 and those you want to look for in the 2nd field in var2, each as a pipe-delimited list.
awk -F':|[[:space:]]+' -v var1="abc|def" -v var2="123|567" '
BEGIN{
num=split(var1,arr1,"|")
split(var2,arr2,"|")
for(i=1;i<=num;i++){
reg1[arr1[i]]
reg2[arr2[i]]
}
}
($1 in reg1) && ($2 in reg2)
' Input_file
Explanation: Adding detailed explanation for above.
awk -F':|[[:space:]]+' -v var1="abc|def" -v var2="123|567" ' ##Starting awk program from here.
##Setting field separator as colon or spaces, setting var1 and var2 values here.
BEGIN{ ##Starting BEGIN section of this program from here.
num=split(var1,arr1,"|") ##Splitting var1 to arr1 here.
split(var2,arr2,"|") ##Splitting var2 to arr2 here.
for(i=1;i<=num;i++){ ##Running for loop from 1 to till value of num here.
reg1[arr1[i]] ##Creating reg1 with index of arr1 value here.
reg2[arr2[i]] ##Creating reg2 with index of arr2 value here.
}
}
($1 in reg1) && ($2 in reg2) ##Checking condition if 1st field is present in reg1 AND 2nd field is present in reg2 then print that line.
' Input_file ##Mentioning Input_file name here.
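As a small sketch of the difference between the two solutions above: the in operator tests for an exact array key, while index() tests for a substring. The array name keys and the value abc_2 are used here only for illustration.

```shell
result=$(awk 'BEGIN {
    keys["abc"]; keys["def"]      # referencing an element creates the key
    field = "abc_2"               # illustrative field value

    # Exact membership: "abc_2" is not one of the keys.
    exact = (field in keys) ? "yes" : "no"

    # Substring test: the key "abc" occurs inside "abc_2".
    partial = "no"
    for (k in keys)
        if (index(field, k)) partial = "yes"

    print "exact=" exact, "partial=" partial
}')
echo "$result"
```

This prints exact=no partial=yes, which is why the exact-match version only suits the original samples and the index() version is needed for values such as 567_1.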
Upvotes: 1