TheQuestion

Reputation: 33

Compare two lists of files and their content

While this might seem simple (and it might be!) I can't seem to find a way to solve it.

What I am trying to do is compare two lists of filtered files by their content. For example, if both lists contain an item called file.config at the location Stuff\files\morefiles, the script should compare those two files and output where and what the changes are. Essentially, it should diff the .config files and show where they differ. This is normally simple for two individual files (Compare-Object and the like can be used), but because I have two lists of files rather than individual ones, I am at a loss.

I need this to list all the changes required to config files during a software upgrade: from one version of the software to the next, what changed in the config files? I'm doing this in PowerShell because it can easily interact with Mercurial (hg) and can be run by less experienced users (via a bat file).

The goal is to have a .txt file listing all the files that are changed in the new installation compared with the old one, or something similar.

Here's what I have so far:

$A = Get-ChildItem -Recurse -path "C:\repos\Dev\Projects\Bat\CurrentVersionRepoCloneTemp" -filter "*.config"

$B = Get-ChildItem -Recurse -path "C:\repos\Dev\Projects\Bat\UpgradeVersionRepoCloneTemp" -filter "*.config"

$C = Compare-Object $A $B -Property ('Name', 'Length') -PassThru | Where-Object {$_.FullName -eq $_.FullName} | ForEach-Object 
{    
    Compare-Object (Get-Content FileA)(Get-Content FileB) #I know this doesn't work 
}$C

Ideas or solutions?

Upvotes: 3

Views: 1318

Answers (2)

TheMadTechnician

Reputation: 36322

Tim Ferrill's idea of checksumming the files seems like a much better way to find which files were updated. Do something like:

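# $md5 is the hasher from Tim Ferrill's answer; it has to exist before the Add-Member lines run
$md5 = New-Object -TypeName System.Security.Cryptography.MD5CryptoServiceProvider
$A = Get-ChildItem -Recurse -Path "C:\repos\Dev\Projects\Bat\CurrentVersionRepoCloneTemp" -Filter "*.config"
$B = Get-ChildItem -Recurse -Path "C:\repos\Dev\Projects\Bat\UpgradeVersionRepoCloneTemp" -Filter "*.config"
# Attach each file's MD5 as a note property; pass FullName so ReadAllBytes gets the full path
$A | %{$_ | Add-Member -NotePropertyName "MD5" -NotePropertyValue ([System.BitConverter]::ToString($md5.ComputeHash([System.IO.File]::ReadAllBytes($_.FullName))))}
$B | %{$_ | Add-Member -NotePropertyName "MD5" -NotePropertyValue ([System.BitConverter]::ToString($md5.ComputeHash([System.IO.File]::ReadAllBytes($_.FullName))))}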

Then I'd do the compare and group by directory.

$C = Compare-Object $A $B -Property ('Name', 'MD5') -PassThru | Group Directory

After that, getting the actual changes is going to be a little slower. A line-by-line match of file contents is rough, but if the files aren't too large it should still happen in the blink of an eye. I'd suggest something like:

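$Output = @()
# Compare-Object tags items from $A (old) with '<=' and items from $B (new) with '=>'
$OldFiles = $C.Group | ?{$_.SideIndicator -eq '<='}
$NewFiles = $C.Group | ?{$_.SideIndicator -eq '=>'}
ForEach($File in $OldFiles){
    $OldData = GC $File.FullName
    # Pairs files by name only, so same-named files in different folders also get compared
    $NewFiles | ?{$_.Name -eq $File.Name} | %{
        $NewData = GC $_.FullName
        # Lines present in only one of the two files (case-insensitive)
        $UpdatedLines = $NewData | ?{$OldData -inotcontains $_}
        $OldLines = $OldData | ?{$NewData -inotcontains $_}
        $Output += New-Object PSObject -Property @{
            UpdatedFile=$_.FullName
            OriginalFile=$File.FullName
            Changes=$UpdatedLines
            Removed=$OldLines
        }
    }
}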

Once you have that you just have to output it in something readable. Maybe something like this:

Get-Date | Out-File "C:\repos\Dev\Projects\Bat\UpgradeVersionRepoCloneTemp\ChangeLog.txt"
$Output|%{$_|FT OriginalFile,UpdatedFile; "New/Changed Lines"; "-----------------"; $_.Changes; " "; "Old/Removed Lines"; "-----------------"; $_.Removed} | Out-File "C:\repos\Dev\Projects\Bat\UpgradeVersionRepoCloneTemp\ChangeLog.txt" -Append
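
If you also want to know where in each file a line sits, Compare-Object on the two files' lines is an alternative to the -inotcontains filters. This is only a rough sketch: Get-LineChange is a hypothetical helper name, it assumes both files contain at least one line, and it relies on the ReadCount property that Get-Content attaches to each line it emits. Compare-Object matches lines as a set rather than by position, so a line that merely moved may not be reported.

```powershell
# Hypothetical helper (not part of the answer above): line-level differences between two text files.
function Get-LineChange {
    param([string]$OldPath, [string]$NewPath)
    # Assumes both files have at least one line; Get-Content tags each line with ReadCount.
    $OldLines = Get-Content -LiteralPath $OldPath
    $NewLines = Get-Content -LiteralPath $NewPath
    Compare-Object -ReferenceObject $OldLines -DifferenceObject $NewLines |
        ForEach-Object {
            [pscustomobject]@{
                # '=>' means the line exists only in the new file, '<=' only in the old one
                Status     = if ($_.SideIndicator -eq '=>') { 'Added' } else { 'Removed' }
                LineNumber = $_.InputObject.ReadCount
                Line       = [string]$_.InputObject
            }
        }
}
```

Inside the loop above, something like Get-LineChange $File.FullName $_.FullName could stand in for the two -inotcontains lines, with the result stored in a single property.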

Upvotes: 2

Tim Ferrill

Reputation: 1674

You could do a checksum of each file and compare that...

$md5 = new-object -TypeName System.Security.Cryptography.MD5CryptoServiceProvider
$hash = [System.BitConverter]::ToString($md5.ComputeHash([System.IO.File]::ReadAllBytes($file)))
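
On PowerShell 4.0 or later, the built-in Get-FileHash cmdlet can do the hashing instead. Here is a minimal sketch of applying the checksum idea to two folder trees, keyed by each file's path relative to its root so that same-named files in different subfolders are not mixed up. Get-ConfigHash and Get-ChangedConfig are hypothetical helper names, not anything from the question, and the sketch assumes each root is an existing folder (not a bare drive root).

```powershell
# Hypothetical helper: maps relative path -> MD5 hash for every *.config under $Root.
function Get-ConfigHash {
    param([string]$Root)
    # Normalise the root so the relative path can be cut out of FullName
    $RootPath = (Resolve-Path -LiteralPath $Root).ProviderPath.TrimEnd([char[]]'\/')
    $Table = @{}
    Get-ChildItem -LiteralPath $RootPath -Recurse -File -Filter '*.config' | ForEach-Object {
        $Relative = $_.FullName.Substring($RootPath.Length + 1)
        $Table[$Relative] = (Get-FileHash -LiteralPath $_.FullName -Algorithm MD5).Hash
    }
    $Table
}

# Hypothetical helper: relative paths that exist in both trees but whose hashes differ.
function Get-ChangedConfig {
    param([string]$OldRoot, [string]$NewRoot)
    $Old = Get-ConfigHash $OldRoot
    $New = Get-ConfigHash $NewRoot
    $Old.Keys | Where-Object { $New.ContainsKey($_) -and $New[$_] -ne $Old[$_] } | Sort-Object
}
```

Calling Get-ChangedConfig with the two clone folders from the question should list the relative paths worth diffing; files that exist in only one tree are deliberately left out here and would need a separate check.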

Upvotes: 3
