Cesar

Reputation: 617

Extract different rows from 2 text files and append to new file

I have 2 text files, one of them is called Invoice1.txt and the other Invoice2.txt. Both files are in the same format.

Invoice1.txt contains:

H~30011000 ~More data ...

L~13332 ~More Data...

L~13332 ~more Data...

and Invoice2.txt contains:

H~30011000 ~More data ...

L~13332 ~More Data...

L~13332 ~More Data...

H~30022000 ~More Data...

L~13999 ~More Data...

L~13999 ~More Data...

Essentially, I am trying to create a new file that contains the rows that are not duplicated across the two files, as explained below. The last 3 lines of Invoice2.txt are NOT in Invoice1.txt, so they would be appended to the new file.

My desired Output would be:

H~30022000 ~More Data...

L~13999 ~More Data...

L~13999 ~More Data...

How would I write this in PowerShell? Would I have to use Get-Content on both .txt files and select the lines that are not equal?

$file1 = "C:\Invoice1.txt"
$file2 = "C:\Invoice2.txt"
$results = "C:\NonDuplicate.txt"

Upvotes: 0

Views: 122

Answers (4)

Matt

Reputation: 46710

If the files are small, Compare-Object would be great for this.

Compare-Object -ReferenceObject (Get-Content $file1) -DifferenceObject (Get-Content $file2) -PassThru | 
    Set-Content $results

This gives you the results you asked for with very little code. It seems to have issues with blank lines, so you might need to post-process those out depending on exactly what you want in the results. -PassThru is there so that the custom objects Compare-Object normally creates are avoided; the lines that do not match are passed through instead. You could use temp variables for the file contents, but why bother if you are only going to use them once?

Compare-Object -ReferenceObject (Get-Content $file1) -DifferenceObject (Get-Content $file2) -PassThru | 
Where-Object{![string]::IsNullOrWhiteSpace($_)}
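Note that Compare-Object reports lines that are unique to either input. If only the lines present in the second file but missing from the first are wanted, one option is to drop -PassThru and filter on the SideIndicator property. The following is a minimal sketch using made-up in-memory sample lines rather than the real files; the variable names are hypothetical.

```powershell
# Hypothetical sample data standing in for the two files' contents.
$reference  = 'H~30011000 ~a', 'L~13332 ~b', 'L~00001 ~only in first'
$difference = 'H~30011000 ~a', 'L~13332 ~b', 'H~30022000 ~c', 'L~13999 ~d'

# '=>' marks items found only in -DifferenceObject; '<=' marks items found only in -ReferenceObject.
$onlyInSecond = Compare-Object -ReferenceObject $reference -DifferenceObject $difference |
    Where-Object SideIndicator -eq '=>' |
    ForEach-Object InputObject

$onlyInSecond
```

With real files, the two arrays would be replaced by (Get-Content $file1) and (Get-Content $file2), and the result piped to Set-Content $results.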

If your files are larger then this might not be efficient.
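For larger files, one possible alternative is to load the first file into a hash set and stream the second file past it, so each lookup is roughly constant time. This is only a sketch, not something measured against Compare-Object: the function name Get-LineNotInReference and its parameters are made up for illustration, the case-insensitive comparer is an assumption chosen to mirror Compare-Object's default behavior, and the ::new() syntax assumes PowerShell 5 or later.

```powershell
function Get-LineNotInReference {
    param(
        [Parameter(Mandatory)] [string] $ReferencePath,
        [Parameter(Mandatory)] [string] $DifferencePath
    )

    # Read the reference file once; @() keeps this an array even for an empty file.
    $lines = [string[]]@(Get-Content -Path $ReferencePath)

    # Assumption: case-insensitive matching. Use [System.StringComparer]::Ordinal for exact matching.
    $seen = [System.Collections.Generic.HashSet[string]]::new($lines, [System.StringComparer]::OrdinalIgnoreCase)

    # Stream the second file and emit non-blank lines the reference file does not contain.
    Get-Content -Path $DifferencePath |
        Where-Object { -not [string]::IsNullOrWhiteSpace($_) -and -not $seen.Contains($_) }
}
```

It could then be called as Get-LineNotInReference -ReferencePath $file1 -DifferencePath $file2 | Set-Content $results.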

Upvotes: 1

Esperento57

Reputation: 17472

Another proposal, which works regardless of how many files you have:

$dirwithfile="C:\temp\test"

#extract list of files
$listfile=gci "$dirwithfile\Invoice*.txt" -file

#for every file, get its content and keep only the rows that do not exist in any of the other files, then write the result to NonDuplicate.txt
$listfile |  
    %{ $filename=$_.Name;  gc $_ | where {$row=$_; ($listfile | where Name -ne $filename | gc) -notcontains $row   } } |
        out-file "$dirwithfile\NonDuplicate.txt" -Append

Upvotes: 1

user6811411

Reputation:

Edit: adapted to the OP's preliminaries

$file1 = ".\Invoice1.txt"
$file2 = ".\Invoice2.txt"
$results = ".\NonDuplicate.txt"
$Content = Get-Content $File1 
Get-Content $File2 |
  ForEach { if ($Content -notcontains $_) {$_} }|
    Set-Content $Results

This is one step simpler:

Get-Content $File2 | Where {$Content -notcontains $_}| Set-Content $Results

Output

> cat .\NonDuplicate.txt
    H~30022000 ~More Data...
    L~13999 ~More Data...
    L~13999 ~More Data...
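One thing to keep in mind with this approach is that -notcontains compares strings case-insensitively by default, so lines that differ only in letter case count as the same line. If case should matter, -cnotcontains is the case-sensitive variant. A small sketch with made-up sample lines (the variable names are hypothetical):

```powershell
# Hypothetical sample data: the first incoming line differs from $known only in case.
$known    = @('L~13332 ~More Data...')
$incoming = 'L~13332 ~more Data...', 'L~13999 ~More Data...'

# Case-insensitive (default): the lowercase "more" line is treated as already known.
$insensitive = @($incoming | Where-Object { $known -notcontains $_ })

# Case-sensitive: both incoming lines are reported as new.
$sensitive = @($incoming | Where-Object { $known -cnotcontains $_ })

"Case-insensitive: $($insensitive.Count) new line(s)"
"Case-sensitive:   $($sensitive.Count) new line(s)"
```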

Upvotes: 3

Ranadip Dutta

Reputation: 9163

There are multiple ways to achieve this, but I made a simple one for you, with a comment on each step explaining how it works.

Below is the script and screenshots for your reference.

# Read the contents of both files
$file1= Get-Content E:\Source_Test\invoice1.txt 
$file2= Get-Content E:\Source_Test\invoice2.txt
# Ignore case by converting to lowercase, remove duplicate lines within each file, and append the result to the output file
($file1).tolower() |sort |  Get-Unique | Out-File E:\source_test\NonDuplicate.txt -Append -Force
($file2).tolower() |sort |  Get-Unique | Out-File E:\source_test\NonDuplicate.txt -Append -Force
# Read the combined output back, remove duplicates again, and store the final result in the file
$file3=Get-Content E:\Source_Test\NonDuplicate.txt
($file3).ToLower() | sort | Get-Unique | Out-File E:\Source_Test\nonduplicate.txt -Force

Images: Invoice 1, Invoice 2, Non-Duplicates

Hope it helps...

Upvotes: 2
