kindzmarauli

Reputation: 176

Powershell Get-ChildItem progress indication

Is it possible to display any sort of a progress indicator for the following (or similar) command?

$dir = 'C:\$Recycle.Bin\S-1-5-18\'
(Get-ChildItem -LiteralPath $dir -File -Recurse -Force -ErrorAction SilentlyContinue | Measure-Object -Property Length -Sum).Sum / 1GB

Piping its output to something like %{Write-Output processing item: $_.fullname; $_}, as suggested in one of the answers to "Powershell Get-ChildItem progress question", doesn't seem to work. For instance,

$dir = 'C:\$Recycle.Bin\S-1-5-18\'
(Get-ChildItem -LiteralPath $dir -File -Force -ErrorAction SilentlyContinue | %{Write-Output processing item: $_.fullname; $_} | Measure-Object -Property Length -Sum).Sum / 1GB

... produces no output at all other than the final result (size in GB).

Context

On a number of our servers, C:\$Recycle.Bin\S-1-5-18\ (recycle bin directory for the "system" account) contains millions of files, hundreds or thousands of sub-directories (which may also contain a large number of files), and keeps filling up at a rate of 1-3GB/day.

The culprit is a misbehaving "line of business" application on these servers that seems to write a lot of temporary files to the recycle bin. At the moment, it is not an option to stop it or to change its behavior.

My priorities are these:

  1. Figure out a way to intelligently empty this recycle bin, perhaps based on item age (e.g. permanently delete files and sub-directories older than NN days, and only do it during certain hours - like between 1 and 4am daily)
  2. Figure out how to prevent the application from using the recycle bin altogether - e.g. by setting registry values for this account / volume (NukeOnDelete, NeedToPurge and MaxCapacity), or a GPO that mandates permanent file deletion.

The first part, clearing the files from the recycle bin using PowerShell, seems to require first getting a list of items to delete via Get-ChildItem, which takes a very long time on large directories residing on fairly slow disks.

Is there a way to get an indicator that Get-ChildItem is doing something when piped to a Measure-Object or a similar command (e.g. number of items it was able to collect so far)?

P.S. The reason I think it might be possible is because Get-ChildItem on its own (without a pipe to Measure-Object) begins output right away. Utilities like TreeSize seem to be able to show progress while they're "walking" the directory tree and collecting data. dir /a/s C:\$Recycle.Bin\S-1-5-18\ also begins output right away, even if it takes a long time to complete. Any chance we could get a progress indicator when we pipe it to a Measure-Object or a similar command?

Upvotes: 3

Views: 694

Answers (3)

mklement0

Reputation: 437062

Generally speaking:

  • Get-ChildItem, like most (well-behaved) PowerShell cmdlets, streams its output, so a subsequent pipeline segment should see input as soon as Get-ChildItem emits the first object.

    • The only caveat in PowerShell (Core) 7+ is that sorting a given directory's immediate entries is invariably performed first, behind the scenes, which can take a while in large directories.
  • However, enclosing any command in (...), the grouping operator, invariably causes its success-stream output to be collected in full, up front, before any of it is passed on.
    By contrast, output to any other output stream surfaces right away.
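
A minimal sketch of that difference, using a slow pipeline of numbers in place of Get-ChildItem (the sleep and the message text are there only for illustration). The Write-Host lines should appear one by one, while the numbers themselves stay inside the (...) group until Measure-Object has seen all of them:

```powershell
# Stand-in for a slow Get-ChildItem call: three objects, 200 ms apart.
$result = (1..3 | ForEach-Object {
    Start-Sleep -Milliseconds 200
    Write-Host "working on $_"   # host/information stream: displayed right away
    $_                           # success stream: passed on to Measure-Object
} | Measure-Object -Sum).Sum     # (...) collects the output before .Sum is read

$result
```

Replacing Write-Host with Write-Output in this sketch would send the status strings down the pipeline to Measure-Object instead of to the display.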


As for what you tried:

(Get-ChildItem -LiteralPath $dir -File -Force -ErrorAction SilentlyContinue | %{Write-Output processing item: $_.fullname; $_} | Measure-Object -Property Length -Sum).Sum / 1GB

... produces no output at all other than the final result (size in GB).

Unlike in the answer you linked to, you're using Write-Output, which targets the success output stream, rather than Write-Host, and that is what prevents streaming the status messages.
Additionally, in this case your status messages become part of the input that is "eaten" by Measure-Object, so they don't print at all. They also distort the result, given that strings have a .Length property too.

Write-Host bypasses the success-output stream, and prints to the display (in PowerShell v5+ via the information output stream), so it is capable of producing display (host) output even while success output is being collected via (...).

(Similarly, targeting the verbose output stream with Write-Verbose would work too, and would give you the option to display progress messages conditionally, based on the $VerbosePreference preference variable; if you pass it the -Verbose switch, it produces output unconditionally.)
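
As a rough sketch of the Write-Verbose variant, the pipeline from the question could be wrapped in an advanced function so that the common -Verbose parameter switches the messages on. The function name Get-DirectorySizeGB and its parameter are hypothetical, made up for this illustration:

```powershell
# Hypothetical wrapper; the name and parameter are assumptions for illustration.
function Get-DirectorySizeGB {
    [CmdletBinding()]   # enables the common -Verbose parameter
    param(
        [Parameter(Mandatory)]
        [string] $LiteralPath
    )

    $sum = (Get-ChildItem -LiteralPath $LiteralPath -File -Recurse -Force -ErrorAction SilentlyContinue |
        ForEach-Object {
            # Verbose stream: shown only with -Verbose or $VerbosePreference = 'Continue'
            Write-Verbose "processing item: $($_.FullName)"
            $_          # pass the file object on to Measure-Object
        } |
        Measure-Object -Property Length -Sum).Sum

    $sum / 1GB
}
```

Calling Get-DirectorySizeGB -LiteralPath $dir -Verbose would then print one message per file, and calling it without -Verbose would stay quiet.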

See this answer for guidance on when use of Write-Host vs. Write-Output is called for.

Upvotes: 2

Tolga

Reputation: 3695

This is a common problem. A simplified version of what you are doing is:

Get-ChildItem -Recurse | Measure-Object -Property Length -Sum

In the above, you lose visibility of Get-ChildItem's output because you are sending it into the pipeline. You can easily mitigate that by adding a pipeline segment that calls Write-Host and then simply puts the object back onto the pipeline so that processing continues:

Get-ChildItem -Recurse | % { Write-Host $_; $_ } | Measure-Object -Property Length -Sum

or written another way:

Get-ChildItem -Recurse | % { Write-Host $_; Write-Output $_ } | Measure-Object -Property Length -Sum

Of course, you do not have to just Write-Host the $_; you can get fancier and do things like accumulating and reporting running totals.

I see you are attempting to take this approach already, but it looks like you accidentally used Write-Output instead of Write-Host. That also sends your status text into the pipeline, so the number of characters in your status text gets added to the sum of the lengths of the files you are measuring. LoL. You can fix your code by changing just that:

$dir = 'C:\$Recycle.Bin\S-1-5-18\'
(Get-ChildItem -LiteralPath $dir -File -Force -ErrorAction SilentlyContinue | %{Write-Host processing item: $_.fullname; $_} | Measure-Object -Property Length -Sum).Sum / 1GB
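
The "fancier" variant mentioned above, accumulating and reporting running totals, might look roughly like the following sketch. The function name Measure-DirectoryWithProgress and the -ReportEvery parameter are hypothetical; reporting only every N files is an assumption made here to keep the progress display from slowing the scan down:

```powershell
# Hypothetical helper; name, parameters and output shape are assumptions.
function Measure-DirectoryWithProgress {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string] $LiteralPath,

        [int] $ReportEvery = 1000   # update the progress display every N files
    )

    [long] $count = 0
    [long] $bytes = 0

    Get-ChildItem -LiteralPath $LiteralPath -File -Recurse -Force -ErrorAction SilentlyContinue |
        ForEach-Object {
            # The script block runs in the function's scope, so the totals accumulate.
            $count++
            $bytes += $_.Length
            if ($count % $ReportEvery -eq 0) {
                $status = '{0:N0} files, {1:N2} GB so far' -f $count, ($bytes / 1GB)
                Write-Progress -Activity 'Scanning' -Status $status
            }
        }

    Write-Progress -Activity 'Scanning' -Completed

    [pscustomobject]@{
        Count  = $count
        Bytes  = $bytes
        SizeGB = $bytes / 1GB
    }
}
```

Because the totals are kept in variables rather than read from Measure-Object, no (...) grouping is needed and the progress display updates while Get-ChildItem is still walking the tree.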

Upvotes: 3

Santiago Squarzon

Reputation: 59782

Perhaps this might help you: it uses the .NET DirectoryInfo API and a Queue<T> to traverse the tree, starting from an initial directory. The advantages of doing it this way are better performance and the ability to track the total length while traversing the directories.

$queue = [System.Collections.Generic.Queue[System.IO.DirectoryInfo]]::new()
$initialItem = Get-Item 'C:\$Recycle.Bin\S-1-5-18\' -ErrorAction Stop
$queue.Enqueue($initialItem)
$totalLength = 0

$theOutputGoesHere = while ($queue.Count) {
    $current = $queue.Dequeue()

    $writeProgressSplat = @{
        Activity = 'Total Length: {0, 10} GB' -f [Math]::Round($totalLength / 1gb, 2)
        Status   = "Processing: '{0}'" -f $current.FullName
    }

    Write-Progress @writeProgressSplat

    try {
        $enum = $current.EnumerateFileSystemInfos()
    }
    catch {
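        # We can't enumerate this directory; add error handling here,
        # or leave it empty to ignore enumeration errors.

        # Don't remove this `continue`: it skips the foreach below, which would
        # otherwise run against the stale $enum from the previous iteration.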
        continue
    }

    foreach ($item in $enum) {
        $item

        if ($item -is [System.IO.DirectoryInfo]) {
            $queue.Enqueue($item)
            continue
        }

        $totalLength += $item.Length
    }
}
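
Once the loop has finished, the collected items could be narrowed down by age, in line with the first priority in the question. The following is only a sketch: the filter name Select-StaleFile is hypothetical, and using LastWriteTime as the measure of age is an assumption (creation time may suit recycle-bin content better):

```powershell
# Hypothetical filter: passes through only files last written more than
# $OlderThanDays days ago; directories are skipped.
function Select-StaleFile {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [System.IO.FileSystemInfo] $Item,

        [Parameter(Mandatory)]
        [int] $OlderThanDays
    )

    begin {
        $cutoff = (Get-Date).AddDays(-$OlderThanDays)
    }
    process {
        if ($Item -is [System.IO.FileInfo] -and $Item.LastWriteTime -lt $cutoff) {
            $Item
        }
    }
}

# Usage with the captured traversal output (30 is an arbitrary example value):
# $stale = $theOutputGoesHere | Select-StaleFile -OlderThanDays 30
# $stale | ForEach-Object { Remove-Item -LiteralPath $_.FullName -Force -WhatIf }
```

The -WhatIf switch makes Remove-Item only report what it would delete; it should be dropped only after the list has been checked.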

Upvotes: 0
