Dronzil

Reputation: 31

Using PowerShell to compare two files and output only the strings that differ

I am a complete beginner at PowerShell, but I need to write a script that takes a file, compares it against another file, and tells me which strings in the first are not in the second. I have had a go at this, but I am struggling with the output: my script currently only tells me on which line things differ, and it also seems to count empty lines.

For context, I would like to keep a static file of known-good Windows processes ($Authorized). The script should pull the list of currently running processes, keep only the process name column, match anything of at least one character, sort the names and remove duplicates, and then compare the result against $Authorized. Finally, it should output the process names that differ, either to the ISE Output Pane or to a file.

I have spent today trying the following in PowerShell ISE and Googling for solutions. I heard that 'fc' is a better choice than Compare-Object, but I could not get it to work. The script below runs, but the final comparison appears to work line by line, which will always give me false positives because the position of a process name in the file changes from run to run. Furthermore, I only want to see the process names that differ, not the line numbers it currently reports ("The Process at Line 34 is an Outlier").

I hope this makes sense, and any help on this would be very much appreciated.

# Dump the names of the running processes to a text file
Get-Process | Format-Table -Wrap -AutoSize -Property ProcessName | Out-File c:\users\me\Desktop\Processes.txt
$Processes = 'c:\Users\me\Desktop\Processes.txt'
$Output_file = 'c:\Users\me\Desktop\Extracted.txt'
$Sorted = 'c:\Users\me\Desktop\Sorted.txt'
$Authorized = 'c:\Users\me\Desktop\Authorized.txt'
# Keep every line that contains at least one character
$regex = '.{1,}'
Select-String -Path $Processes -Pattern $regex | % { $_.Matches } | % { $_.Value } > $Output_file
Get-Content $Output_file | Sort-Object -Unique > $Sorted
# Compare the sorted process list against the authorized list
$dif = Compare-Object -ReferenceObject $(Get-Content $Sorted) -DifferenceObject $(Get-Content $Authorized) -IncludeEqual
$lineNumber = 1
foreach ($difference in $dif)
{
    if ($difference.SideIndicator -ne "==")
    {
        Write-Output "The Process at Line $lineNumber is an Outlier"
    }
    $lineNumber++
}
# Clean up the intermediate files
Remove-Item c:\Users\me\Desktop\Processes.txt
Remove-Item c:\Users\me\Desktop\Extracted.txt
Write-Output "The Results are Stored in $Sorted"

Upvotes: 1

Views: 5851

Answers (1)

TessellatingHeckler

Reputation: 28973

From the length and complexity of your script, I feel like I'm missing something, but your description seems clear:

  1. Running process names:
    • $ProcessNames = @(Get-Process | Select-Object -ExpandProperty Name)
    • .. which aren't blank: $ProcessNames = $ProcessNames | Where-Object {$_ -ne ''}
  2. List of authorised names from a file:
    • $AuthorizedNames = Get-Content 'c:\Users\me\Desktop\Authorized.txt'
  3. Compare:
    • $UnAuthorizedNames = $ProcessNames | Where-Object { $_ -notin $AuthorizedNames }
  4. Optionally, write the result to a file:
    • $UnAuthorizedNames | Set-Content out.txt
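Put together, the four steps might look like the sketch below. The function name Get-UnauthorizedProcessName, its parameters, and the sample data are made up for illustration, and Sort-Object -Unique is added only because the question asked for unique names. It assumes PowerShell 3.0 or later, since -notin is not available before that.

```powershell
# Hypothetical helper: returns the names in $ProcessName that are not listed in $AuthorizedName.
function Get-UnauthorizedProcessName {
    param(
        [string[]]$ProcessName,     # e.g. @(Get-Process | Select-Object -ExpandProperty Name)
        [string[]]$AuthorizedName   # e.g. Get-Content 'c:\Users\me\Desktop\Authorized.txt'
    )

    # Step 1: drop blank names; collapse duplicates so each name is reported once
    $candidates = $ProcessName | Where-Object { $_ -ne '' } | Sort-Object -Unique

    # Step 3: keep only the names that are missing from the authorized list
    $candidates | Where-Object { $_ -notin $AuthorizedName }
}

# Sample data so the sketch runs anywhere; swap in the real sources named in the comments above.
$sampleProcesses  = 'svchost', 'explorer', '', 'svchost', 'notepad'
$sampleAuthorized = 'svchost', 'explorer'

$unauthorized = @(Get-UnauthorizedProcessName -ProcessName $sampleProcesses -AuthorizedName $sampleAuthorized)
$unauthorized
```

Because the comparison is by name rather than by position, the order of the lines in either list does not matter.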

or in the shell:

@(gps).Name -ne '' |? { $_ -notin (gc authorized.txt) } | sc out.txt

1  2    3     4     5        6      7                      8

1. @() forces something to be an array, even if it only returns one thing
2. gps is a default alias of Get-Process
3. using .Property on an array takes that property value from every item in the array
4. using an operator on an array filters the array by whether the items pass the test
5. ? is an alias of Where-Object
6. -notin tests if one item is not in a collection
7. gc is an alias of Get-Content
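8. sc is an alias of Set-Content (in Windows PowerShell; the alias was removed in PowerShell 6 and later, so spell out Set-Content there)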
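Notes 1, 3, 4 and 6 are easiest to see on a small array. The sketch below uses made-up objects in place of Get-Process output and a hard-coded list in place of authorized.txt. It assumes PowerShell 3.0 or later, which is where property access across an array and -notin became available.

```powershell
# Made-up objects standing in for Get-Process output (assumption: only the Name property matters here)
$running = @(
    [pscustomobject]@{ Name = 'svchost' }
    [pscustomobject]@{ Name = '' }
    [pscustomobject]@{ Name = 'explorer' }
    [pscustomobject]@{ Name = 'notepad' }
)
$authorized = 'svchost', 'explorer'   # stands in for (Get-Content authorized.txt)

# Note 1: @() always yields an array, even around a single value
$single = @('svchost')

# Note 3: .Name on an array collects that property from every element
$names = $running.Name

# Note 4: a comparison operator with an array on the left filters it instead of returning True/False
$nonBlank = $names -ne ''

# Note 6: -notin tests one item against a collection
$unauthorized = @($nonBlank | Where-Object { $_ -notin $authorized })

"Single-item array count: $($single.Count)"
"Non-blank names: $($nonBlank -join ', ')"
"Unauthorized: $($unauthorized -join ', ')"
```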

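You should use Set-Content instead of Out-File and > because it handles character encoding more predictably: in Windows PowerShell, Out-File and > write UTF-16 by default, which other tools can misread, while PowerShell 6 and later default to UTF-8 for all of them. And because Get-Content/Set-Content is a memorable matched pair, and Get-Content/Out-File isn't.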
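If the encoding matters to whatever reads the file afterwards, it can be stated explicitly rather than left to the version-specific default. A minimal sketch, where the temp-file path and the sample names are placeholders only:

```powershell
# Placeholder path and names, chosen only so the sketch runs anywhere
$outFile = Join-Path ([System.IO.Path]::GetTempPath()) 'unauthorized-demo.txt'
$names   = 'notepad', 'calc'

# State the encoding explicitly instead of relying on the default
$names | Set-Content -Path $outFile -Encoding UTF8

# Get-Content returns one string per line, so the list comes back as it went in
$roundTrip = @(Get-Content -Path $outFile -Encoding UTF8)
$roundTrip

# Remove the demo file again
Remove-Item -Path $outFile
```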

Upvotes: 1
