Envay

Reputation: 21

Match Strings in two different files and output this + following line

From a rather complicated contract I have two .txt files. The files look as follows:

File1:

Hash: 5FD876CDF0FCFAF1E7F018F5A8519A7B

Path: /Users/foobar/Desktop/whatever.jpg

File2:

Hash: 0EDEFB152D489163E603F8C55F5463A7

Path: c:\Migration\Templates\script.exe

The goal is to compare File1 and File2 and find matching hashes. Once the matching hashes are found, they (and the path that follows each) should be written to another .txt file.

I've googled the problem extensively, but most solutions are meant to find differences or are structured differently, and my PowerShell expertise isn't enough to adapt them properly.

# pattern to match the Hashes from File2
$pattern = "^[a-f0-9]{32}$"

# read in File1 as an array to compare to
$set1 = Get-Content -Path 'C:\File1.txt'

# prepare to collect the results
$results = New-Object -TypeName System.Text.StringBuilder

# start scanning the text2 file for matches to text1 entries
Get-Content -Path 'C:\File2.txt' | ForEach-Object {
    if ($_ -match $pattern)
    {
        $hashes = $Matches['hash']
        if ($hashes -in $set1)
        {
            [void]$results.AppendLine($_)
        }
    }
}

# output the results
$results.ToString() | Out-File -FilePath 'C:\output.txt' -Encoding ascii

The code above doesn't quite work yet; I would appreciate help with the finishing touches.

Thank you for reading my post!

Upvotes: 2

Views: 220

Answers (1)

user6811411

Reputation:

  • Your pattern is anchored at the start of the line (^) but doesn't include the prefix Hash:, so it never matches a line of the file.
  • To capture the path as well, you need a multiline pattern. That means reading the file as a single string with the -Raw parameter and including the options (?sm) in the pattern.
  • It's also unclear whether a file can contain several Hash:/Path: combinations, and how many files the pairs have to be extracted from.
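The first two points can be checked on a small sample. This is only a sketch: the sample string mimics the layout shown in the question, and the variable names ($text, $lineMatches, $rawMatches) are made up for illustration.

```powershell
# Sample text in the layout from the question (blank line between Hash: and Path:)
$text = "Hash: 5FD876CDF0FCFAF1E7F018F5A8519A7B`n`nPath: /Users/foobar/Desktop/whatever.jpg`n"

# The question's pattern expects the whole line to be 32 hex characters,
# so the "Hash: " prefix prevents any match
$lineMatches = 'Hash: 5FD876CDF0FCFAF1E7F018F5A8519A7B' -match '^[a-f0-9]{32}$'

# A pattern that includes both prefixes, applied to the text as one string.
# (?m) lets ^ match at the start of every line; \s+ skips the line breaks in between.
$pattern = '(?m)^Hash: (?<Hash>[0-9A-F]{32})\s+Path: (?<Path>[^\r\n]*)'
$rawMatches = $text -match $pattern

# On success the named groups are available in the automatic $Matches variable
$hash = $Matches['Hash']
$path = $Matches['Path']
```

Here $lineMatches is false and $rawMatches is true, which is why the file has to be read as one string before matching.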

I suggest:

  • using a function to extract the information, and then either
    1. with only two files, using Compare-Object to get the matching pairs, or
    2. with more files, gathering all the output, grouping it by Hash with Group-Object, and outputting every group with a count greater than 1 together with all its paths.
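For the second option, a minimal sketch of the grouping step could look like this. The hashes and paths are invented sample data standing in for pairs gathered from several files, and $pairs, $duplicates and $duplicatePairs are illustrative names.

```powershell
# Hypothetical Hash/Path pairs, as they might be gathered from three files
$pairs = @(
    [PSCustomObject]@{ Hash = 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'; Path = 'C:\one\a.txt' }
    [PSCustomObject]@{ Hash = 'BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB'; Path = 'C:\one\b.txt' }
    [PSCustomObject]@{ Hash = 'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'; Path = '/two/a-copy.txt' }
    [PSCustomObject]@{ Hash = 'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC'; Path = 'D:\three\c.txt' }
)

# Group by hash and keep only the hashes that occur more than once
$duplicates = $pairs | Group-Object -Property Hash | Where-Object { $_.Count -gt 1 }

# Flatten each remaining group back into its Hash/Path pairs
$duplicatePairs = $duplicates | ForEach-Object { $_.Group }
$duplicatePairs
```

Unlike the Compare-Object route, this keeps every path that shares a hash, regardless of which file it came from.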

The following script extracts Hash/Path pairs from a given file and outputs them as a [PSCustomObject].

## Q:\Test\2019\07\20\SO_57124977.ps1

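function Extract-HashPath($filename){
    # (?sm): multiline mode lets ^ match at the start of every line.
    # [^\r\n]* stops at the line end, so Path never picks up a trailing carriage return.
    $Pattern = '(?sm)^Hash: (?<Hash>[0-9A-F]{32})(?:\r?\n)*Path: (?<Path>[^\r\n]*)'
    # -Raw reads the file as a single string so the pattern can span several lines
    (Get-Content $filename -Raw |
        Select-String -Pattern $Pattern -AllMatches).Matches | ForEach-Object {
            # Emit one object per Hash:/Path: pair found in the file
            [PSCustomObject]@{
                Hash = $_.Groups['Hash'].Value
                Path = $_.Groups['Path'].Value
            }
        }
}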

$File1HashPath = Extract-HashPath '.\File1.txt'
$File2HashPath = Extract-HashPath '.\File2.txt'
$File1HashPath
$File2HashPath

Sample output using the text from your question (the two sample hashes differ, so there is nothing to match):

> Q:\Test\2019\07\20\SO_57124977.ps1

Hash                             Path
----                             ----
5FD876CDF0FCFAF1E7F018F5A8519A7B /Users/foobar/Desktop/whatever.jpg
0EDEFB152D489163E603F8C55F5463A7 c:\Migration\Templates\script.exe
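For the two-file case, the remaining step is to compare the extracted objects on Hash and write the matches out. The following is a sketch under assumptions: $file1Pairs and $file2Pairs are stand-ins for $File1HashPath and $File2HashPath, the second File1 entry is invented so that one hash overlaps, and the output location is a hypothetical temp file.

```powershell
# Hypothetical stand-ins for $File1HashPath and $File2HashPath, sharing one hash
$file1Pairs = @(
    [PSCustomObject]@{ Hash = '5FD876CDF0FCFAF1E7F018F5A8519A7B'; Path = '/Users/foobar/Desktop/whatever.jpg' }
    [PSCustomObject]@{ Hash = '0EDEFB152D489163E603F8C55F5463A7'; Path = '/Users/foobar/Desktop/script-copy.exe' }
)
$file2Pairs = @(
    [PSCustomObject]@{ Hash = '0EDEFB152D489163E603F8C55F5463A7'; Path = 'c:\Migration\Templates\script.exe' }
)

# Keep only objects whose Hash occurs on both sides; -PassThru returns the
# matching object from the reference side, Path included
$matching = Compare-Object -ReferenceObject $file1Pairs -DifferenceObject $file2Pairs `
    -Property Hash -IncludeEqual -ExcludeDifferent -PassThru

# Write the matches in the same Hash:/Path: layout as the input files
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'output.txt'   # hypothetical location
$matching | ForEach-Object { "Hash: $($_.Hash)"; "Path: $($_.Path)" } |
    Set-Content -Path $outFile -Encoding ascii
```

Note that only the path from the reference side (File1) ends up in the output. If the paths from both files are needed, grouping the combined pairs by Hash is the better fit.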

Upvotes: 1
