Yishu Fang

Reputation: 9958

How can I find which lines in one file do not start with any line from another file, using bash?

I have two text files, A and B:

A:

a start
b stop
c start
e start

B:

b
c

How can I find the lines in A that do not start with any of the lines from B, using a shell (bash) command? In this case, I want to get this output:

a start
e start

Can I do this with a single command line?

Upvotes: 1

Views: 55

Answers (3)

gniourf_gniourf

Reputation: 46823

This should do:

sed '/^$/d;s/^/^/' B | grep -vf - A

The sed command takes every non-empty line of B and prepends a caret ^ to it, which anchors the pattern at the start of a line for grep, then writes the result to stdout. The /^$/d command deletes empty lines first, because an empty line would turn into a bare ^, which matches every line. grep then reads its patterns from stdin (the -f option takes a pattern file, and - stands for stdin) and, because of the -v option, prints only the lines of A that match none of them. Note that each line of B is interpreted as a regular expression, so characters such as . or * in B keep their special meaning.
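To try the pipeline on the sample data from the question, here is a minimal sketch. The function name not_started_by and the scratch directory are assumptions made for illustration; they are not part of the answer itself.

```shell
# Hypothetical helper: print the lines of file $2 that do not start with
# any non-empty line of file $1 (lines of $1 are treated as regexes).
not_started_by() {
    sed '/^$/d;s/^/^/' "$1" | grep -vf - "$2"
}

# Recreate the sample files A and B from the question in a scratch directory.
demo_dir=$(mktemp -d)
printf '%s\n' 'a start' 'b stop' 'c start' 'e start' > "$demo_dir/A"
printf '%s\n' 'b' 'c' > "$demo_dir/B"

# Prints:
#   a start
#   e start
not_started_by "$demo_dir/B" "$demo_dir/A"
```

Wrapping the one-liner in a function only makes it easier to reuse; the pipeline inside is the same as the one above.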

Upvotes: 3

foobarfuzzbizz

Reputation: 58637

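You can try building a character class from B and using it with grep.

Save the first letter of each line of B into FIRSTLETTERLIST. You can do this with cut and tr, or with some sed work.

The idea is to treat those letters as a blacklist and then match it against the file you are interested in. Note that this compares only the first character of each line, which is enough here because every line of B is a single character.

grep -v "^[$FIRSTLETTERLIST]" A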

This is untested, so I won't guarantee it will work, but it should point you in the right direction.
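One way the character-class idea could be filled in is sketched below. The function name not_started_by_first_char is hypothetical, and the sketch assumes the first characters in B are plain letters or digits, since characters such as ] or ^ would change the meaning of the bracket expression.

```shell
# Hypothetical helper: drop the lines of file $2 whose first character is
# also the first character of some line in file $1.
# Assumption: those first characters are plain letters or digits.
# The body runs in a subshell so first_chars does not leak to the caller.
not_started_by_first_char() (
    # cut -c1 keeps the first character of each line; tr joins them.
    first_chars=$(cut -c1 "$1" | tr -d '\n')
    if [ -z "$first_chars" ]; then
        # Nothing to exclude, so every line passes through unchanged.
        cat "$2"
    else
        grep -v "^[$first_chars]" "$2"
    fi
)
```

Called as not_started_by_first_char B A, this only looks at the first character, so a line such as "ba" in B would also exclude "b stop".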

Upvotes: 0

sampson-chen

Reputation: 47267

I think this should do it:

sed 's/^/\^/g' B > C.tmp
grep -vEf C.tmp A
rm C.tmp
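If a fixed name like C.tmp might collide with an existing file, the same steps can be sketched with a file created by mktemp. The function name not_started_by_tmpfile is hypothetical, and deleting empty lines of B first is an added assumption, made because an empty line would become a bare ^ that matches everything.

```shell
# Hypothetical helper: same temp-file approach, but with a unique file
# from mktemp that is removed afterwards.
# The body runs in a subshell so its variables do not leak to the caller.
not_started_by_tmpfile() (
    patterns=$(mktemp) || exit 1
    # Anchor each non-empty line of $1 at the start of the line.
    sed '/^$/d;s/^/^/' "$1" > "$patterns"
    grep -vEf "$patterns" "$2"
    status=$?            # keep grep's exit status across the cleanup
    rm -f "$patterns"
    exit "$status"
)
```

Called as not_started_by_tmpfile B A, it prints the same two lines as the commands above and returns grep's exit status.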

Upvotes: 1
