Reputation: 123
What is the best approach for taking the difference of two files, in other words the complement of one file in another? Is there something in Unix or shell scripting, or a direct Python utility?
Let's say: File 1 has below.
File 2 has below text:
File 2 is known to be a subset of File 1. The output should be File 1 with the first occurrence of each element of File 2 removed. So the output looks like below:
In other words, the output is the complement of File 2 in File 1. (Ordering is not important.)
Upvotes: 0
Views: 57
Reputation: 16081
You can do this with Python:
file_1_data = open('file_1.txt').read().split('\n')
file_2_data = open('file_2.txt').read().split('\n')
# Remove the first occurrence of each File 2 line from File 1
for data in file_2_data:
    if data in file_1_data:
        file_1_data.remove(data)
# Note: this overwrites file_1.txt with the result
open('file_1.txt','w').write('\n'.join(file_1_data))
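For larger files, the repeated list.remove calls above can get slow, since each one scans the list. A minimal sketch of the same idea using collections.Counter is below; the function name subtract_lines and the sample data are hypothetical, chosen only for illustration. It keeps the original order of File 1 and returns a new list instead of overwriting the input.

```python
from collections import Counter

def subtract_lines(lines_1, lines_2):
    """Return lines_1 with the first occurrence of each lines_2 entry removed."""
    # How many copies of each line still need to be dropped
    to_remove = Counter(lines_2)
    result = []
    for line in lines_1:
        if to_remove[line] > 0:
            to_remove[line] -= 1  # drop this occurrence
        else:
            result.append(line)   # keep everything else, in order
    return result

# Illustrative data only, not the files from the question
print(subtract_lines(["A", "B", "A", "C"], ["A", "C"]))  # ['B', 'A']
```

To use it on files, read each one with something like open(path).read().splitlines() and pass the two lists in.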
Upvotes: 0
Reputation: 23667
comm is best suited for this task, but it needs sorted input:
$ comm -23 <(sort file1) <(sort file2)
15
A
A
B
E
F
From man comm:
comm - compare two sorted files line by line
-2 suppress column 2 (lines unique to FILE2)
-3 suppress column 3 (lines that appear in both files)
If the output needs to be sorted as shown in the expected output:
$ comm -23 <(sort file1) <(sort file2) | sort -n
A
A
B
E
F
15
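For readers more comfortable with Python, here is a rough sketch of what comm -23 does with two sorted inputs: it walks both lists in step and keeps only the lines of the first that have no matching line in the second. The function name only_in_first is hypothetical, and the sketch assumes both lists are sorted with Python's default string ordering, which can differ from the locale-dependent order that sort and comm use.

```python
def only_in_first(sorted_1, sorted_2):
    """Lines of sorted_1 with no matching line in sorted_2 (similar to comm -23)."""
    result = []
    j = 0  # current position in sorted_2
    for line in sorted_1:
        # Skip entries of sorted_2 that sort before the current line
        while j < len(sorted_2) and sorted_2[j] < line:
            j += 1
        if j < len(sorted_2) and sorted_2[j] == line:
            j += 1  # matched pair: consume it from both inputs
        else:
            result.append(line)  # unique to the first input
    return result

# Illustrative data only, not the files from the question
print(only_in_first(sorted(["A", "B", "A", "C"]), sorted(["A", "C"])))  # ['A', 'B']
```

Duplicate lines are paired one to one, so a line that appears twice in the first input and once in the second is kept once.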
Upvotes: 0
Reputation: 195079
awk 'NR==FNR{a[NR]=$0;n=NR;next}
{for(i=1;i<=n;i++)if($0==a[i]){delete a[i];next}print}' file2 file1
will give you:
A
B
E
F
15
A
The code is straightforward and self-explanatory.
Upvotes: 1