Reputation: 617
I have two text files, Invoice1.txt and Invoice2.txt. Both files are in the same format.
Invoice1.txt contains:
H~30011000 ~More data ...
L~13332 ~More Data...
L~13332 ~more Data...
and Invoice2.txt contains:
H~30011000 ~More data ...
L~13332 ~More Data...
L~13332 ~More Data...
H~30022000 ~More Data...
L~13999 ~More Data...
L~13999 ~More Data...
Essentially, I am trying to create a new file that contains only the rows that are not duplicated across the two files. The last 3 lines of Invoice2.txt are NOT in Invoice1.txt, so they should be written to the new file.
My desired Output would be:
H~30022000 ~More Data...
L~13999 ~More Data...
L~13999 ~More Data...
How would I write this in PowerShell? Would I have to use Get-Content on both .txt files and select the objects that are not equal?
$file1 = "C:\Invoice1.txt"
$file2 = "C:\Invoice2.txt"
$results = "C:\NonDuplicate.txt"
Upvotes: 0
Views: 122
Reputation: 46710
If the files are small, then Compare-Object would be great for this.
Compare-Object -ReferenceObject (Get-Content $file1) -DifferenceObject (Get-Content $file2) -PassThru |
Set-Content $results
This gives you exactly the results you asked for with very little code. It seems to have issues with blank lines, so you might need to post-process those out depending on what you want in the results. -PassThru is there so that the custom objects Compare-Object normally creates are avoided; instead, the lines that do not match are passed through. You could use temp variables for the files' contents, but why bother if you are only going to use them once?
Compare-Object -ReferenceObject (Get-Content $file1) -DifferenceObject (Get-Content $file2) -PassThru |
Where-Object{![string]::IsNullOrWhiteSpace($_)}
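One thing to keep in mind: Compare-Object reports lines that are unique to either input. With the sample data every line of Invoice1.txt also appears in Invoice2.txt, so only Invoice2.txt lines come through. A minimal sketch, assuming you only ever want the lines from the second file, filters on SideIndicator instead of using -PassThru. The arrays $reference and $difference are made-up stand-ins for the two Get-Content calls.

```powershell
# Stand-in data; in practice these would be (Get-Content $file1) and (Get-Content $file2).
$reference  = 'H~30011000 ~A', 'L~13332 ~B', 'L~11111 ~only in file 1'
$difference = 'H~30011000 ~A', 'L~13332 ~B', 'H~30022000 ~C', 'L~13999 ~D'

# '=>' marks objects found only in the difference set (the second file);
# '<=' would mark objects found only in the reference set.
$onlyInSecond = Compare-Object -ReferenceObject $reference -DifferenceObject $difference |
    Where-Object { $_.SideIndicator -eq '=>' } |
    ForEach-Object { $_.InputObject }

$onlyInSecond
```

Here the line that exists only in the first set is dropped, which -PassThru on its own would not do.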
If your files are large, this might not be efficient.
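For larger files, one possible alternative (not part of the Compare-Object approach above) is to load the first file into a .NET HashSet once and stream the second file past it. This is only a sketch: the InvoiceDemo folder and the sample lines are hypothetical, created so the snippet runs on its own, and it assumes PowerShell 5 or later for the ::new() syntax.

```powershell
# Hypothetical sample files in the temp directory, so the sketch is runnable as-is.
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'InvoiceDemo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$file1   = Join-Path $dir 'Invoice1.txt'
$file2   = Join-Path $dir 'Invoice2.txt'
$results = Join-Path $dir 'NonDuplicate.txt'
Set-Content -Path $file1 -Value 'H~30011000 ~A', 'L~13332 ~B'
Set-Content -Path $file2 -Value 'H~30011000 ~A', 'L~13332 ~B', 'H~30022000 ~C', 'L~13999 ~D', 'L~13999 ~D'

# Load the first file into a set once, so each later lookup avoids rescanning the whole file.
# OrdinalIgnoreCase is an assumption here, chosen to ignore case the way Compare-Object does by default.
$lines = [string[]]@(Get-Content -Path $file1)
$seen  = [System.Collections.Generic.HashSet[string]]::new($lines, [System.StringComparer]::OrdinalIgnoreCase)

# Stream the second file and keep only the lines the set does not contain.
Get-Content -Path $file2 |
    Where-Object { -not $seen.Contains($_) } |
    Set-Content -Path $results
```

Unlike Compare-Object, this only ever emits lines from the second file, and it keeps them in their original order.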
Upvotes: 1
Reputation: 17472
Another proposal, which works regardless of how many files you have:
$dirwithfile="C:\temp\test"
#extract list of files
$listfile=gci "$dirwithfile\Invoice*.txt" -file
#for every file, get its content and keep only the rows that do not exist in any of the other files, then write the result to NonDuplicate.txt
$listfile |
%{ $filename=$_.Name; gc $_ | where {$row=$_; ($listfile | where Name -ne $filename | gc) -notcontains $row } } |
out-file "$dirwithfile\NonDuplicate.txt" -Append
Upvotes: 1
Reputation:
Edit: adapted to the OP's preliminaries
$file1 = ".\Invoice1.txt"
$file2 = ".\Invoice2.txt"
$results = ".\NonDuplicate.txt"
$Content = Get-Content $File1
Get-Content $File2 |
ForEach { if ($Content -notcontains $_) {$_} }|
Set-Content $Results
This is one step simpler:
Get-Content $File2 | Where {$Content -notcontains $_}| Set-Content $Results
Output
> cat .\NonDuplicate.txt
H~30022000 ~More Data...
L~13999 ~More Data...
L~13999 ~More Data...
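A note on case: in the question's sample, the third line of Invoice1.txt reads "more Data" while the matching line in Invoice2.txt reads "More Data". The output above still omits it because -notcontains compares case-insensitively. If case should matter, -cnotcontains is the case-sensitive variant. A small sketch with a one-line stand-in for $Content:

```powershell
# Stand-in for the content of the first file.
$Content = @('L~13332 ~more Data...')

# -notcontains ignores case, so the differently cased line counts as already present.
$insensitive = 'L~13332 ~More Data...' | Where-Object { $Content -notcontains $_ }

# -cnotcontains is case-sensitive, so the same line is treated as new.
$sensitive = 'L~13332 ~More Data...' | Where-Object { $Content -cnotcontains $_ }
```

With the case-insensitive operator nothing is returned; with the case-sensitive one the line passes through.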
Upvotes: 3
Reputation: 9163
There are multiple ways to achieve this, but I have written a simple one and explained in the comments how each step works.
Below is the script for your reference.
# Taking input from both the files
$file1= Get-Content E:\Source_Test\invoice1.txt
$file2= Get-Content E:\Source_Test\invoice2.txt
# Get-Unique is case-sensitive, so convert to lowercase first. Sort each file, keep one copy of each distinct line, and append the result to the output file
($file1).tolower() |sort | Get-Unique | Out-File E:\source_test\NonDuplicate.txt -Append -Force
($file2).tolower() |sort | Get-Unique | Out-File E:\source_test\NonDuplicate.txt -Append -Force
# Read the combined data back, keep one copy of each distinct line again, and overwrite the output file with the final result
$file3=Get-Content E:\Source_Test\NonDuplicate.txt
($file3).ToLower() | sort | Get-Unique | Out-File E:\Source_Test\nonduplicate.txt -Force
Hope it helps...
Upvotes: 2