Gowtham Sarathy

Reputation: 35

PowerShell: compare text files and write difference lines to a new file

File 1.txt

Abc
Def
Xyz

File 2.txt

Xyz
Def

When File 1 is compared against File 2, Abc is not found in File 2, so I want to write Abc to a new file, diff.txt.

diff.txt

Abc

I have seen many posts that use Compare-Object, but none of them produce the output I need. I am new to PowerShell.

Upvotes: 0

Views: 5275

Answers (1)

RoadRunner

Reputation: 26335

From my understanding, you want to write all lines from File1.txt that don't exist in File2.txt.

We can use Get-Content to read both files into an array of strings, and use Where-Object to filter lines from File1.txt that are -notin File2.txt. We can then output the differences to a new file with Out-File.

$file2 = Get-Content -Path .\File2.txt

$diff = Get-Content -Path .\File1.txt | Where-Object {$_ -notin $file2}

$diff | Out-File -FilePath diff.txt
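Since the question mentions Compare-Object, here is a rough sketch of how it could produce the same result. The function name Get-MissingLine and its parameter names are made up for illustration, and the sketch assumes both inputs contain at least one line, since Compare-Object rejects a null input. The '<=' side indicator marks lines that appear only in the reference input.

```powershell
# Hypothetical helper (the name is an assumption, not a built-in cmdlet).
# Returns the lines of $Reference that Compare-Object reports as missing
# from $Difference. Comparison is case-insensitive by default.
function Get-MissingLine {
    param(
        [string[]] $Reference,   # lines of the first file
        [string[]] $Difference   # lines of the second file
    )

    Compare-Object -ReferenceObject $Reference -DifferenceObject $Difference |
        # '<=' means the line was found only in -ReferenceObject
        Where-Object { $_.SideIndicator -eq '<=' } |
        # Keep just the line text, not the comparison wrapper object
        Select-Object -ExpandProperty InputObject
}
```

It could then be called as Get-MissingLine (Get-Content .\File1.txt) (Get-Content .\File2.txt) | Out-File -FilePath diff.txt. Compare-Object may treat repeated lines differently from -notin, so the two approaches are only interchangeable when duplicates don't matter.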

However, each -notin check is an O(N) linear search over the lines of File2.txt, so for larger files filtering every line of File1.txt this way can be expensive. Instead, we can use a System.Collections.Generic.HashSet<T>, whose System.Collections.Generic.HashSet<T>.Contains(T) method gives constant-time O(1) lookups on average.

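For the example below I use System.Linq.Enumerable.ToHashSet to create the hash set from the array of strings returned by Get-Content, which is passed as a System.Collections.Generic.IEnumerable<T>. The StringComparer.CurrentCultureIgnoreCase comparer keeps the lookups case-insensitive, which matches the default behavior of -notin.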

$file2HashSet = [Linq.Enumerable]::ToHashSet(
  [string[]] (Get-Content -Path .\File2.txt),
  [StringComparer]::CurrentCultureIgnoreCase
)

$diff = Get-Content -Path .\File1.txt | Where-Object {-not $file2HashSet.Contains($_)}

$diff | Out-File -FilePath diff.txt

diff.txt

Abc
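Enumerable.ToHashSet may not be available on older .NET Framework versions (as far as I know it was added in .NET Framework 4.7.2 and .NET Core 2.0). If it is missing, a sketch along these lines should work instead, building the set directly with the HashSet<T> constructor. The function name Get-MissingLineFast is hypothetical, and the ::new() syntax assumes PowerShell 5.0 or later.

```powershell
# Hypothetical helper (the name is an assumption, not a built-in cmdlet).
# Same filtering as the ToHashSet example, using the HashSet constructor.
function Get-MissingLineFast {
    param(
        [string[]] $Reference,   # lines of the first file
        [string[]] $Difference   # lines of the second file (must not be empty)
    )

    # Build the set once; each Contains call is then O(1) on average.
    $seen = [System.Collections.Generic.HashSet[string]]::new(
        $Difference,
        [StringComparer]::CurrentCultureIgnoreCase
    )

    # Emit only the lines that are not in the set.
    $Reference | Where-Object { -not $seen.Contains($_) }
}
```

Usage would look like Get-MissingLineFast (Get-Content .\File1.txt) (Get-Content .\File2.txt) | Out-File -FilePath diff.txt.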

Upvotes: 2
