Linguist

Reputation: 123

Complement, or removal of the elements of text File 2 from another text File 1

What is the best approach for taking the difference of two files, in other words the complement of one file in another? Is there something in Unix or shell scripting, or a direct Python utility?

Let's say: File 1 has below.

File 2 has below text:

It is known that File 2 is a subset of File 1. The output should be File 1 with the first occurrence of each element of File 2 removed, so it looks like below:

In other words, the output is simply the complement of File 2 in File 1. (Ordering is not important.)

Upvotes: 0

Views: 57

Answers (3)

Rahul K P

Reputation: 16081

You can do this with Python:

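# Read both files as lists of lines
with open('file_1.txt') as f1, open('file_2.txt') as f2:
    file_1_data = f1.read().split('\n')
    file_2_data = f2.read().split('\n')

# Remove the first occurrence of each File 2 line from File 1
for data in file_2_data:
    if data in file_1_data:
        file_1_data.remove(data)

# Overwrite File 1 with the remaining lines
with open('file_1.txt', 'w') as f:
    f.write('\n'.join(file_1_data))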
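
The loop above scans File 1 once for every line of File 2, since both `in` and `list.remove` are linear searches. For larger files, a single-pass variant can be sketched with `collections.Counter`. This is only a sketch: the function name `remove_first_occurrences` and the sample lines are made up for illustration and are not the data from the question.

```python
from collections import Counter

def remove_first_occurrences(lines_1, lines_2):
    """Return lines_1 with the first occurrence of each line in lines_2 removed."""
    # How many copies of each line still have to be dropped
    to_remove = Counter(lines_2)
    remaining = []
    for line in lines_1:
        if to_remove[line] > 0:
            # This copy is matched against File 2, so skip it
            to_remove[line] -= 1
        else:
            remaining.append(line)
    return remaining

# Hypothetical sample data, not the files from the question
sample = remove_first_occurrences(["A", "B", "A", "C"], ["A", "C"])
print(sample)
```

The order of File 1 is preserved, and a line that appears twice in File 2 removes two copies from File 1.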

Upvotes: 0

Sundeep

Reputation: 23667

comm is best suited for this task, but it needs sorted input:

$ comm -23 <(sort file1) <(sort file2)
15
A
A
B
E
F

From man comm

   comm - compare two sorted files line by line

   -2     suppress column 2 (lines unique to FILE2)

   -3     suppress column 3 (lines that appear in both files)


If the output needs to be sorted as shown in the expected output:

$ comm -23 <(sort file1) <(sort file2) | sort -n
A
A
B
E
F
15
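
Where comm or process substitution is not available, the same idea (a multiset difference in which order does not matter) can be sketched in Python. This is an illustrative sketch, not a drop-in replacement for comm: the helper name `multiset_difference` and the sample lines are assumptions, and Python's `sorted` orders strings by code point, which may differ from the locale-dependent order used by sort.

```python
from collections import Counter

def multiset_difference(lines_1, lines_2):
    """Lines of lines_1 left after removing one copy per line of lines_2, sorted."""
    # Counter subtraction keeps only lines whose count stays positive
    remaining = Counter(lines_1) - Counter(lines_2)
    return sorted(remaining.elements())

# Hypothetical sample data, not the files from the question
print(multiset_difference(["A", "B", "A", "A"], ["A"]))
```

Duplicates are handled per copy: three copies in the first list minus one in the second leaves two.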

Upvotes: 0

Kent

Reputation: 195079

awk 'NR==FNR{a[NR]=$0;n=NR;next}
    {for(i=1;i<=n;i++)if($0==a[i]){delete a[i];next}print}' file2 file1

will give you:

A
B
E
F
15
A

The code is straightforward and self-explanatory.

Upvotes: 1
