Reputation: 799
I am not sure how to phrase this question, but an example should make it clear. Suppose I have this file:
$ cat intoThat
a b
a h
a l
a m
b c
b d
b m
c b
c d
c f
c g
c p
d h
d f
d p
and this list:
$ cat grepThis
a
b
c
d
Now I would like to grep the patterns in grepThis against intoThat, and I would do this:
$ grep -wf grepThis intoThat
which will give an output like this:
**a b**
a h
a l
a m
**b c**
**b d**
b m
**c b**
**c d**
c f
c g
c p
d h
d f
d p
The asterisks highlight the lines that I would like grep to return. These are the lines that match in full, i.e. both words on the line are in the list. How can I tell grep (or awk, or whatever) to return only these lines? Of course, some lines may not match any pattern at all, e.g. the intoThat file may contain other letters such as g, h, l, s, t, etc.
Upvotes: 0
Views: 287
Reputation: 2761
With awk, you could do:
awk 'NR==FNR{ seen[$0]++; next } ($1 in seen && $2 in seen)' grepThis intoThat
a b
b c
b d
c b
c d
NR is set to 1 when awk reads the first record and is incremented for each subsequent record, across all input files, until every record (line) has been read.
FNR is also set to 1 when awk reads the first record and is incremented for each subsequent record in the current file, but it is reset to 1 at the start of each following input file.
So NR == FNR is always true for the first input file, and the block that follows it acts on the first file only.
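To make the difference between NR and FNR concrete, here is a minimal sketch that prints both counters for every line of two small files. The file names first and second and their contents are made up for illustration only.

```shell
# Hypothetical two-file demo; "first" and "second" are illustrative names.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/first"
printf 'x\ny\nz\n' > "$tmp/second"

# Print the global record number, the per-file record number, and the line.
nr_fnr=$(awk '{ print NR, FNR, $0 }' "$tmp/first" "$tmp/second")
printf '%s\n' "$nr_fnr"
# 1 1 a
# 2 2 b
# 3 1 x
# 4 2 y
# 5 3 z

rm -rf "$tmp"
```

The two counters agree only on the first two lines, which come from the first file. One caveat: if the first file is empty, NR == FNR is also true while awk reads the second file, so the idiom assumes a non-empty first file.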
seen is an awk associative array (you can choose any name you like) whose keys are the whole lines ($0) and whose values count how many times each line has occurred (the same idiom is commonly used to remove duplicate records in awk).
The next statement skips the rest of the commands for the current record, so those commands actually run only for the file(s) after the first.
In the (...) condition that follows, we just check whether both columns, $1 and $2, are present in the array; if so, the line is printed (printing is awk's default action when a condition has no block).
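Putting the pieces together, the one-liner can be spelled out with comments as in the sketch below. It recreates the two files from the question in a temporary directory (the temporary directory and the matches variable are only for illustration) and assumes each line of intoThat has exactly two whitespace-separated fields.

```shell
# Recreate the sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' a b c d > "$tmp/grepThis"
printf '%s\n' 'a b' 'a h' 'a l' 'a m' 'b c' 'b d' 'b m' \
              'c b' 'c d' 'c f' 'c g' 'c p' 'd h' 'd f' 'd p' > "$tmp/intoThat"

matches=$(awk '
    # First file (grepThis): record each line as a key, then skip to the next record.
    NR == FNR { seen[$0]++; next }
    # Second file (intoThat): print the line only if both fields are known keys.
    ($1 in seen) && ($2 in seen)
' "$tmp/grepThis" "$tmp/intoThat")

printf '%s\n' "$matches"
rm -rf "$tmp"
```

With the sample data this prints the five lines that were marked with asterisks in the question. Lines with more than two fields would need extra conditions, or a loop over the fields up to NF.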
Upvotes: 3