Reputation: 15
I'm writing a batch script that compares two text files and removes the lines they have in common.
This is for my personal use, to make my work faster.
I am using the code below:
copy textfile1.txt output.txt >nul
findstr /lvxig:textfile1.txt textfile2.txt >>output.txt
textfile1.txt contains:
apple
orange
mango
textfile2.txt contains:
apple
mango
grapes
I expect output.txt to contain:
orange
grapes
But the output I am getting in output.txt is:
apple
orange
mango
grapes
I don't want to merge the two text files. I want to compare them and remove every line that appears in both.
Upvotes: 2
Views: 1880
Reputation: 334
Try this (it uses process substitution, so it needs a Bash shell such as Git Bash or WSL rather than cmd):
cat textfile1.txt textfile2.txt | grep -Fvxf <(comm -12 <(sort -u textfile1.txt) <(sort -u textfile2.txt))
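Explanation of this code:
cat: reads both files and concatenates them
comm -12 <(sort -u textfile1.txt) <(sort -u textfile2.txt): shows only the lines that appear in both files
grep -Fvxf: removes the common lines reported by comm -12 (-F fixed strings, -v invert the match, -x match whole lines only, -f read the patterns from a file)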
So:
textfile1.txt:
apple
orange
mango
textfile2.txt:
apple
mango
grapes
Output:
orange
grapes
which is what the question asks for.
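The one-liner above can also be unpacked into separate steps, which may be easier to follow and debug. This is only a sketch: the intermediate file names (sorted1.txt, sorted2.txt, common.txt) and the temporary directory are my own choices for illustration, and the sample data is copied from the question.

```shell
# Work in a throwaway directory so the sample files do not clobber real ones.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data from the question.
printf '%s\n' apple orange mango > textfile1.txt
printf '%s\n' apple mango grapes > textfile2.txt

# Step 1: comm needs sorted input; -u also drops repeats within one file.
sort -u textfile1.txt > sorted1.txt
sort -u textfile2.txt > sorted2.txt

# Step 2: -12 suppresses columns 1 and 2, leaving only lines found in both files.
comm -12 sorted1.txt sorted2.txt > common.txt

# Step 3: keep only lines that do not exactly match any line in common.txt.
cat textfile1.txt textfile2.txt | grep -Fvxf common.txt > output.txt

cat output.txt
```

Because the filtering runs on the unsorted concatenation, the surviving lines keep their original order (textfile1.txt first, then textfile2.txt).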
Upvotes: 0
Reputation: 34989
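Your code copies all of textfile1.txt into output.txt first, so the shared lines are still there. What about filtering each file against the other instead: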
findstr /LVXIG:"textfile2.txt" "textfile1.txt" > "output.txt"
findstr /LVXIG:"textfile1.txt" "textfile2.txt" >>"output.txt"
Or with common redirection:
(
findstr /LVXIG:"textfile2.txt" "textfile1.txt"
findstr /LVXIG:"textfile1.txt" "textfile2.txt"
) > "output.txt"
Using your example data, the first findstr command line returns:
orange
And the second one outputs:
grapes
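For comparison, here is a rough equivalent of the same two-pass filter for a Unix-style shell, in case findstr is not available. It is a sketch under the assumption that grep's -F, -v, -x, -i and -f options play the roles of findstr's /L, /V, /X, /I and /G; the temporary directory and sample data are only there to make it self-contained.

```shell
# Work in a throwaway directory with the sample data from the question.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' apple orange mango > textfile1.txt
printf '%s\n' apple mango grapes > textfile2.txt

{
    # Lines of textfile1.txt that do not occur in textfile2.txt.
    grep -Fvxi -f textfile2.txt textfile1.txt
    # Lines of textfile2.txt that do not occur in textfile1.txt.
    grep -Fvxi -f textfile1.txt textfile2.txt
} > output.txt

cat output.txt
```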
Upvotes: 2
Reputation: 16266
How about building a hash table that counts how often each line occurs, then outputting only the lines that occur exactly once? This reads each file only once.
=== undupe.ps1
$hash = @{}
Get-Content 'textfile1.txt', 'textfile2.txt' | ForEach-Object { $hash[$_]++ }
foreach ($key in $hash.Keys) { if ($hash[$key] -eq 1) { Write-Output $key } }
Run it from a cmd shell or a .bat script:
powershell -NoLogo -NoProfile -File "undupe.ps1" >output.txt
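The same counting idea can be sketched with awk for a Unix-style shell. This is an illustration of the technique, not a translation of the script above: the order array is my own addition so that lines come out in first-seen order, and awk compares keys case-sensitively. As with any count-based approach, a line repeated inside a single file is also dropped, because its count exceeds one.

```shell
# Work in a throwaway directory with the sample data from the question.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' apple orange mango > textfile1.txt
printf '%s\n' apple mango grapes > textfile2.txt

awk '
    {
        if (!($0 in count)) order[++n] = $0   # remember first-seen order
        count[$0]++                           # tally each line across both files
    }
    END {
        # Print only the lines that were seen exactly once.
        for (i = 1; i <= n; i++)
            if (count[order[i]] == 1) print order[i]
    }
' textfile1.txt textfile2.txt > output.txt

cat output.txt
```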
Upvotes: 0