eth0

Reputation: 1

grep inverse read match pattern from two files

I have a large file. Some of its lines have more than two columns, and the columns are separated by a single space on some lines and by multiple spaces on others.

file1.txt:
there is a line here that has more than two columns
## this line is a comment
blahblah:     blahblahSierraexample7272
foo: [email protected]
nonsense:                    nonsense59s59S
nonsense:   someRandomColumn
.....

I have another file that is a subset of file1.txt. It has only two columns, and they are always separated by exactly one space.

file2.txt:
foo: [email protected]
nonsense: nonsense59s59S

Now I would like to delete from file1.txt every line that appears in file2.txt. How can I do that in a shell script? Note that file2.txt has only two columns, while file1.txt can have more. So the match should work like this: $1 from file2 matches $1 from file1, and $NF from file2 matches $NF from file1; then invert the match and print the remaining lines.

P.S. I already tried grep -vf file2.txt file1.txt, but it didn't work because the amount of whitespace between the first column and $NF is not fixed. sed or awk should do the trick, but I can't come up with the code.

sed -i '/^<firstColumnOfFile2> .* <lastColumnOfFile2>$/d' file1.txt (perhaps in a while loop!)

or something like: grep -vw -f ^[(1stColofFile2)] and also [(lastColOfFile2)]$ file1.txt

Upvotes: 0

Views: 413

Answers (2)

Ed Morton

Reputation: 204055

$ awk 'NR==FNR{a[$0]; next} {orig=$0; $1=$1} !($0 in a){print orig}' file2 file1
there is a line here that has more than two columns
## this line is a comment
blahblah:     blahblahSierraexample7272
foo: [email protected]
nonsense:   someRandomColumn
.....
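
For reference, the one-liner above stores every line of file2 as an array key, rebuilds each line of file1 with single spaces between fields via $1=$1, and prints the original line only when the rebuilt one is not among the keys. If the comparison should instead look only at the first and last fields, as the question describes, a minimal sketch might look like the following. The sample data and the temporary directory are made up for illustration, and the array name seen is arbitrary.

```shell
#!/bin/sh
# Hypothetical sample data, invented for this sketch.
tmp=$(mktemp -d)
cat > "$tmp/file1.txt" <<'EOF'
there is a line here that has more than two columns
## this line is a comment
blahblah:     blahblahSierraexample7272
nonsense:                    nonsense59s59S
nonsense:   someRandomColumn
EOF
cat > "$tmp/file2.txt" <<'EOF'
nonsense: nonsense59s59S
EOF

# First pass (file2): remember each (first field, last field) pair.
# Second pass (file1): print only the lines whose pair was not remembered.
result=$(awk 'NR==FNR { seen[$1, $NF]; next } !(($1, $NF) in seen)' \
    "$tmp/file2.txt" "$tmp/file1.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Note that this variant also deletes longer lines whose first and last fields happen to match a pair from file2, and that the NR==FNR idiom assumes file2 is not empty.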

Upvotes: 0

Shawn

Reputation: 52539

You can use sed to turn the lines in file2.txt into regular expressions that match one or more spaces after the colon, and then use grep to remove the lines from file1.txt that match those:

$ grep -Evf <(sed 's/^\([^:]*\): /^\1:[[:space:]]+/' file2.txt) file1.txt
there is a line here that has more than two columns
## this line is a comment
blahblah:     blahblahSierraexample7272
foo: [email protected]
nonsense:   someRandomColumn
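
To see what grep actually receives, it can help to run the sed step on its own. The sketch below feeds it one sample line (taken from the file2.txt in the question) instead of the whole file; the variable name patterns is only for illustration.

```shell
#!/bin/sh
# Convert one "key: value" line into the extended regex that grep -E will use.
patterns=$(printf '%s\n' 'nonsense: nonsense59s59S' |
    sed 's/^\([^:]*\): /^\1:[[:space:]]+/')

printf '%s\n' "$patterns"
# prints: ^nonsense:[[:space:]]+nonsense59s59S
```

Two things to keep in mind: the rest of each line is passed through unchanged, so characters such as . or + in the value are treated as regex operators rather than literals, and the pattern is anchored only at the start of the line. Also, the <(...) process substitution in the command above needs a shell such as bash, zsh, or ksh.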

Upvotes: 0
