Reputation: 193
I have two text files that contain many duplicate lines. I would like to run a PowerShell statement that writes a new file containing only the lines from the second file that are NOT already in the first file. Below is an example of the two files.
File1.txt
-----------
Alpha
Bravo
Charlie
File2.txt
-----------
Alpha
Echo
Foxtrot
In this case, only Echo and Foxtrot are not in the first file. So these would be the desired results.
OutputFile.txt
------------
Echo
Foxtrot
I reviewed the link below, which is similar to what I want, but it does not write the results to an output file.
Remove lines from file1 that exist in file2 in Powershell
Upvotes: 2
Views: 4980
Reputation: 1388
The approach in the referenced link will work; however, for every line in the original file, it triggers a read of the second file from disk. This could be painful depending on the size of your files. I think the following approach would meet your needs.
$file1 = Get-Content .\File1.txt
$file2 = Get-Content .\File2.txt
$compareParams = @{
    ReferenceObject  = $file1
    DifferenceObject = $file2
}
Compare-Object @compareParams |
    Where-Object -Property SideIndicator -eq '=>' |
    Select-Object -ExpandProperty InputObject |
    Out-File -FilePath .\OutputFile.txt
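This code does the following:
- Reads each file into memory once with Get-Content
- Passes the two sets of lines to Compare-Object using splatting (see about_Splatting for more information)
- Keeps only the results whose SideIndicator is '=>', meaning the line appears only in File2.txt
- Expands InputObject to get the plain text and writes it to the output file with Out-File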
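To see what the SideIndicator filter is working with, here is a small sketch that uses the sample values from the question as in-memory arrays instead of files. The variable names $all and $onlyInFile2 are only for illustration.

```powershell
# Sample values from the question, held in arrays instead of files.
$file1 = 'Alpha', 'Bravo', 'Charlie'
$file2 = 'Alpha', 'Echo', 'Foxtrot'

# Without a filter, Compare-Object reports differences from both sides:
# '<=' marks values found only in the reference set,
# '=>' marks values found only in the difference set.
$all = Compare-Object -ReferenceObject $file1 -DifferenceObject $file2

# Keep only the values that exist solely in the second set.
$onlyInFile2 = $all |
    Where-Object -Property SideIndicator -eq '=>' |
    Select-Object -ExpandProperty InputObject

$onlyInFile2   # Echo and Foxtrot
```

Bravo and Charlie also appear in $all, marked '<=', which is why the filter is needed before writing the output file.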
If you are comfortable with the overall flow of this, and are only using this in one-off situations, the whole thing can be compressed into a one-liner.
(Compare-Object (gc .\File1.txt) (gc .\File2.txt) | ? SideIndicator -eq '=>').InputObject | Out-File .\OutputFile.txt
Upvotes: 2
Reputation: 1855
Here's one way to do it:
# Get unique values from first file
$uniqueFile1 = (Get-Content -Path .\File1.txt) | Sort-Object -Unique
# Get lines in second file that aren't in first and save to a file
Get-Content -Path .\File2.txt | Where-Object { $uniqueFile1 -notcontains $_ } | Out-File .\OutputFile.txt
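Note that -notcontains scans the array once for every line of the second file. If the files are large, a hash set lookup may be faster. The following is a minimal sketch of that alternative, not part of the answer above: the function name Get-LinesNotIn is hypothetical, the ::new() syntax assumes PowerShell 5.0 or later, and the case-insensitive comparer is chosen to roughly match the default behavior of -notcontains.

```powershell
# Hypothetical helper (not from the answers above): returns the lines of
# $Candidate that do not occur in $Reference, using a hash set for lookups.
function Get-LinesNotIn {
    param(
        [string[]]$Reference,
        [string[]]$Candidate
    )

    # Case-insensitive set of every line in the reference input.
    $seen = [System.Collections.Generic.HashSet[string]]::new([System.StringComparer]::OrdinalIgnoreCase)
    foreach ($line in $Reference) { [void]$seen.Add($line) }

    # Emit each candidate line that is not in the set, preserving input order.
    foreach ($line in $Candidate) {
        if (-not $seen.Contains($line)) { $line }
    }
}

# Usage with the sample values from the question.
$result = Get-LinesNotIn -Reference 'Alpha', 'Bravo', 'Charlie' -Candidate 'Alpha', 'Echo', 'Foxtrot'
$result   # Echo, Foxtrot
```

With the files from the question, the call would be Get-LinesNotIn -Reference (Get-Content .\File1.txt) -Candidate (Get-Content .\File2.txt) | Out-File .\OutputFile.txt. Lines that repeat in the second file are written once per occurrence.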
Upvotes: 3