SQL_Deadwood
SQL_Deadwood

Reputation: 521

Remove known Excel passwords with PowerShell

I have this PowerShell code that loops through Excel files in a specified directory; references a list of known passwords to find the correct one; and then opens, decrypts, and saves that file to a new directory.

But it's not executing as quickly as I'd like (it's part of a larger ETL process and it's a bottleneck). At this point I can remove the passwords faster manually as the script takes ~40 minutes to decrypt 40 workbooks while referencing a list of ~50 passwords.

Is there a cmdlet or function (or something) that's missing which would speed this up, an overlooked flaw in the processing, or is PowerShell, perhaps, just not the right tool for this job?

Original Code (updated code can be found below):

$ErrorActionPreference = "SilentlyContinue"

CLS

# Paths
$encrypted_path = "C:\PoShTest\Encrypted\"
$decrypted_Path = "C:\PoShTest\Decrypted\"
$original_Path =  "C:\PoShTest\Originals\"
$password_Path =  "C:\PoShTest\Passwords\Passwords.txt"

# Load Password Cache
$arrPasswords = Get-Content -Path $password_Path

# Load File List
$arrFiles = Get-ChildItem $encrypted_path

# Create counter to display progress
[int] $count = ($arrfiles.count -1)

# Loop through each file
$arrFiles| % {
    $file  = get-item -path $_.fullname
    # Display current file
    write-host "Processing" $file.name -f "DarkYellow"
    write-host "Items remaining: " $count `n

    # Excel xlsx
    if ($file.Extension -eq ".xlsx") {

    # Loop through password cache
        $arrPasswords | % {
            $passwd = $_

            # New Excel Object
            $ExcelObj = $null
            $ExcelObj = New-Object -ComObject Excel.Application
            $ExcelObj.Visible = $false

            # Attempt to open file
            $Workbook = $ExcelObj.Workbooks.Open($file.fullname,1,$false,5,$passwd)
            $Workbook.Activate()

            # if password is correct - Save new file without password to $decrypted_Path
                if ($Workbook.Worksheets.count -ne 0) {
                    $Workbook.Password=$null
                    $savePath = $decrypted_Path+$file.Name
                    write-host "Decrypted: " $file.Name -f "DarkGreen"
                    $Workbook.SaveAs($savePath)
            # Close document and Application
                    $ExcelObj.Workbooks.close()
                    $ExcelObj.Application.Quit()

            # Move original file to $original_Path
                    move-item $file.fullname -Destination $original_Path -Force
                }
                else {
            # Close document and Application
                    write-host "PASSWORD NOT FOUND: " $file.name -f "Magenta"
                    $ExcelObj.Close()
                    $ExcelObj.Application.Quit()
                }
        }

    }

$count--
# Next File
}

Write-host "`n Processing Complete" -f "Green"

Updated code:

# Get Current EXCEL Process ID's so they are not affected but the scripts cleanup
# SilentlyContinue in case there are no active Excels
$currentExcelProcessIDs = (Get-Process excel -ErrorAction SilentlyContinue).Id

$a = Get-Date

$ErrorActionPreference = "SilentlyContinue"

CLS

# Paths
$encrypted_path = "C:\PoShTest\Encrypted"
$decrypted_Path = "C:\PoShTest\Decrypted\"
$processed_Path = "C:\PoShTest\Processed\"
$password_Path  = "C:\PoShTest\Passwords\Passwords.txt"

# Load Password Cache
$arrPasswords = Get-Content -Path $password_Path

# Load File List
$arrFiles = Get-ChildItem $encrypted_path

# Create counter to display progress
[int] $count = ($arrfiles.count -1)

# New Excel Object
$ExcelObj = $null
$ExcelObj = New-Object -ComObject Excel.Application
$ExcelObj.Visible = $false

# Loop through each file
$arrFiles| % {
    $file  = get-item -path $_.fullname
    # Display current file
    write-host "`n Processing" $file.name -f "DarkYellow"
    write-host "`n Items remaining: " $count `n

    # Excel xlsx
    if ($file.Extension -like "*.xls*") {

    # Loop through password cache
        $arrPasswords | % {
            $passwd = $_

            # Attempt to open file
            $Workbook = $ExcelObj.Workbooks.Open($file.fullname,1,$false,5,$passwd)
            $Workbook.Activate()

            # if password is correct, remove $passwd from array and save new file without password to $decrypted_Path
                if ($Workbook.Worksheets.count -ne 0) 

                {   
                    $Workbook.Password=$null
                    $savePath = $decrypted_Path+$file.Name
                    write-host "Decrypted: " $file.Name -f "DarkGreen"
                    $Workbook.SaveAs($savePath)

             # Added to keep Excel process memory utilization in check
                    $ExcelObj.Workbooks.close()

             # Move original file to $processed_Path
                    move-item $file.fullname -Destination $processed_Path -Force

                }
                else {
            # Close Document
                    $ExcelObj.Workbooks.Close()
                }
        }

    }



$count--
# Next File
}
# Close Document and Application
    $ExcelObj.Workbooks.close()
    $ExcelObj.Application.Quit()

Write-host "`nProcessing Complete!" -f "Green"
Write-host "`nFiles w/o a matching password can be found in the Encrypted folder."
Write-host "`nTime Started   : " $a.ToShortTimeString()
Write-host "Time Completed : " $(Get-Date).ToShortTimeString()
Write-host "`nTotal Duration : " 
NEW-TIMESPAN –Start $a –End $(Get-Date)

# Remove any stale Excel processes created by this script's execution
Get-Process excel -ErrorAction SilentlyContinue | Where-Object{$currentExcelProcessIDs -notcontains $_.id} | Stop-Process

Upvotes: 3

Views: 5678

Answers (2)

Robin
Robin

Reputation: 1645

This solution uses the modules ImportExcel for easier working with Excel files, and PoshRSJob for multithreaded processing.

If you do not have these, install them by running:

Install-Module ImportExcel -scope CurrentUser
Install-Module PoshRSJob -scope CurrentUser

I've raised an issue on the ImportExcel module GitHub page where I've proposed a solution to open encrypted Excel files. The author may propose a better solution (and consider the impact across other functions in the module, but this works for me). For now, you'll need to make a modification to the Import-Excel function yourself:

Open: C:\Username\Documents\WindowsPowerShell\Modules\ImportExcel\2.4.0\ImportExcel.psm1 and scroll to the Import-Excel function. Replace:

[switch]$DataOnly

With

[switch]$DataOnly,
[String]$Password

Then replace the following line:

$xl = New-Object -TypeName OfficeOpenXml.ExcelPackage -ArgumentList $stream

With the code suggested here. This will let you call the Import-Excel function with a -Password parameter.

Next we need our function to repeatedly try and open a singular Excel file using a known set of passwords. Open a PowerShell window and paste in the following function (note: this function has a default output path defined, and also outputs passwords in the verbose stream - make sure no-one is looking over your shoulder or just remove that if you'd prefer):

function Remove-ExcelEncryption
{
    [CmdletBinding()]
    Param
    (
        [Parameter(Mandatory=$true)]
        [String]
        $File,

        [Parameter(Mandatory=$false)]
        [String]
        $OutputPath = 'C:\PoShTest\Decrypted',

        [Parameter(Mandatory=$true)]
        [Array]
        $PasswordArray
    )

    $filename = Split-Path -Path $file -Leaf

    foreach($Password in $PasswordArray)
    {
        Write-Verbose "Attempting to open $file with password: $Password"
        try
        {
            $ExcelData = Import-Excel -path $file -Password $Password -ErrorAction Stop
            Write-Verbose "Successfully opened file."
        }
        catch
        {
            Write-Verbose "Failed with error $($Error[0].Exception.Message)"
            continue
        }

        try
        {
            $null = $ExcelData | Export-Excel -Path $OutputPath\$filename
            return "Success"
       }
        catch
        {
            Write-Warning "Could not save to $OutputPath\$filename"
        }
    }
}

Finally, we can run code to do the work:

$Start = get-date
$PasswordArray = @('dj7F9vsm','kDZq737b','wrzCgTWk','DqP2KtZ4')
$files = Get-ChildItem -Path 'C:\PoShTest\Encrypted'
$files | Start-RSJob -Name {$_.Name} -ScriptBlock {
    Remove-ExcelEncryption -File $_.Fullname -PasswordArray $Using:PasswordArray -Verbose
} -FunctionsToLoad Remove-ExcelEncryption -ModulesToImport Import-Excel | Wait-RSJob | Receive-RSJob
$end = Get-Date
New-TimeSpan -Start $Start -End $end

For me, if the correct password is first in the list it runs in 13 seconds against 128 Excel files. If I call the function in a standard foreach loop, it takes 27 seconds.

To view which files were successfully converted we can inspect the output property on the RSJob objects (this is the output of the Remove-ExcelEncryption function where I've told it to return "Success"):

Get-RSJob | Select-Object -Property Name,Output

Hope that helps.

Upvotes: 2

Matt
Matt

Reputation: 46710

If nothing else I do see one glaring performance issue that should be easy to address. You are opening a new excel instance for testing each individual password for each document. 40 workbooks with 50 passwords mean you have opened 2000 Excel instances one at a time.

You should be able to keep using the same one without a functionality hit. Get this code out of your inner most loop

# New Excel Object
$ExcelObj = $null
$ExcelObj = New-Object -ComObject Excel.Application
$ExcelObj.Visible = $false

as well as the snippet that would close the process. It would need to be out of the loop as well.

$ExcelObj.Close()
$ExcelObj.Application.Quit()

If that does not help enough you would have to consider doing some sort of parallel processing with jobs etc. I have a basic solution in a CodeReview.SE answer of mine doing something similar.

Basically what it does is run several excels at once where each one works on a chunk of documents which runs faster than one Excel doing them all. Just like I do in the linked answer I caution the automation of Excel COM with PowerShell. COM objects don't always get released properly and locks can be left on files or processes.


You are looping for all 50 passwords regardless of success or not. That means you could find the right password on the first go but you are still going to try the other 49! Set a flag in the loop to break that inner loop when that happens.

As far as the password logic goes you say that

At this point I can remove the passwords faster manually since the script takes ~40 minutes

Why can you do it faster? What do you know that the script does not. I don't see you being able to out perform the script but doing exactly what it does.

With what I see another suggestion would be to keep/track successful passwords and associated file name. So that way when it gets processed again you would know the first password to try.

Upvotes: 3

Related Questions