minuvnath

Reputation: 51

PowerShell multithreading for get-childItem

I have a requirement to scan a CIFS share and get the file properties and ACL properties of all the files and folders in the share. I know Get-ChildItem has a -Recurse option, but on very large shares using -Recurse is really time-consuming. I understand this can be sped up with multithreading.

Assume the hierarchy looks like the following:

Root
Root\FolderA
Root\FolderA\FolderA1\FolderA2\FolderA3\FolderA4
Root\FolderB\..
..

I have managed to write a script that gets the file properties and ACLs of all the files and folders in the root and starts a job for each folder in the root (FolderA, FolderB, etc.), and it runs without any error. I tried creating jobs for each and every folder (every level of the directory structure), and this results in the jobs hanging or PowerShell getting force-closed. I am using PowerShell v2.0 and upgrading the version is not possible in our environment. I am new to PowerShell, so please forgive me if this is a very silly question.
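Roughly, the working part of the script does something like the following (\\server\share and the selected properties are placeholders, not the actual code; the real script writes the results out rather than just emitting them):

# Rough sketch of the current approach (PowerShell 2.0); \\server\share is a placeholder.
$root = '\\server\share'

# File properties and ACLs of the items directly under the root
Get-ChildItem -Path $root | ForEach-Object {
    $_ | Select-Object FullName, Length, LastWriteTime
    Get-Acl -Path $_.FullName
}

# One job per top-level folder, each scanning its own subtree (no throttling)
Get-ChildItem -Path $root | Where-Object { $_.PSIsContainer } | ForEach-Object {
    Start-Job -ArgumentList $_.FullName -ScriptBlock {
        param($folderPath)
        Get-ChildItem -Path $folderPath -Recurse | ForEach-Object {
            $_ | Select-Object FullName, Length, LastWriteTime
            Get-Acl -Path $_.FullName
        }
    } | Out-Null
}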

Thanks in advance for the help.

Upvotes: 2

Views: 6482

Answers (3)

websch01ar

Reputation: 2123

Do you have PowerShell 3 available on the machine? If you do, then you can create a Workflow that takes an arraylist of folders. I do not have a snippet for doing this, but if you are interested I can come up with something.

Edit (adding pseudo code below):

workflow GetFileInformation
{
    param([System.IO.FileSystemInfo[]] $folders)

    foreach -parallel ($folder in $folders)
    {
        inlinescript 
        {
            # $using: is required to reference the workflow's $folder inside inlinescript
            $files = GCI -LiteralPath ($using:folder).FullName -File
            # Here you will have an Array of System.IO.FileSystemInfo
            # I do not know what you want to do from here, 
            # but the caller will have no visibility of this object 
            # since it is on a separate thread.
            # but you can write the results to a file or database.
            # Hope this helps some.
        }
    }
}

$dir = GCI C:\ -Directory -Recurse
GetFileInformation $dir

Upvotes: 3

mjolinor

Reputation: 68331

I would not use PowerShell jobs for this. Getting file and ACL information is a relatively trivial task, and there are built-in executables for it. Initializing a PowerShell job session is a substantial investment of resources, and not a good one for trivial tasks.

Instead of jobs, I would use the legacy dir and cacls\icacls commands to get the file and ACL information, with the output piped to files for collection and aggregation later. Use a PowerShell script to create and launch the cmd processes, monitoring the created processes to keep the number of concurrent workers throttled. Then go back with another script to collect and aggregate the information from the files.
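A rough sketch of that idea might look like this (the share path, output folder, and throttle limit are placeholders, and icacls is assumed to be available):

# Sketch only: one cmd.exe process per top-level folder, output piped to files.
# \\server\share and C:\scan-output are placeholder paths.
$root    = '\\server\share'
$outDir  = 'C:\scan-output'
$maxProc = 5            # throttle: at most 5 worker processes at a time
$procs   = @()

Get-ChildItem -Path $root | Where-Object { $_.PSIsContainer } | ForEach-Object {
    # Wait until one of our worker processes finishes before starting another
    while (@($procs | Where-Object { -not $_.HasExited }).Count -ge $maxProc) {
        Start-Sleep -Seconds 2
    }
    $name = $_.Name
    # dir /s for file information, icacls /t for ACLs, each redirected to its own file
    $cmdArgs = "/c dir /s /q ""$($_.FullName)"" > ""$outDir\$name.dir.txt"" & icacls ""$($_.FullName)"" /t > ""$outDir\$name.acl.txt"""
    $procs += Start-Process -FilePath cmd.exe -ArgumentList $cmdArgs -WindowStyle Hidden -PassThru
}

# A second script would then parse and aggregate the *.dir.txt and *.acl.txt files.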

IMHO

Upvotes: 0

alroc

Reputation: 28194

I tried creating jobs for each and every folder (every level of the directory structure), and this results in the jobs hanging or PowerShell getting force-closed.

That's because you're not throttling the job creation. You're probably creating hundreds, if not thousands, of parallel jobs, which will exhaust memory on any server. Starting multiple parallel jobs or threads is great and can improve overall execution time - until you create so many that your system can't handle the load.

See this SO answer for a method to throttle the number of jobs to a reasonable count. To avoid resource contention, I would recommend keeping the job count under 10 except on very large servers with very fast storage.
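As a rough illustration of the throttling pattern (the share path, job limit, and job body here are just placeholders), the idea is to block before starting a new job whenever the number of running jobs reaches the limit:

# Sketch of throttled job creation: never more than $maxJobs running at once.
$maxJobs = 8
$folders = Get-ChildItem -Path '\\server\share' | Where-Object { $_.PSIsContainer }

foreach ($folder in $folders) {
    # Block until the number of running jobs drops below the limit
    while (@(Get-Job -State Running).Count -ge $maxJobs) {
        Start-Sleep -Seconds 2
    }
    Start-Job -ArgumentList $folder.FullName -ScriptBlock {
        param($folderPath)
        # Placeholder work: enumerate the subtree and read each item's ACL
        Get-ChildItem -Path $folderPath -Recurse | ForEach-Object { Get-Acl -Path $_.FullName }
    } | Out-Null
}

# Collect the output once everything has finished
Get-Job | Wait-Job | Receive-Job
Get-Job | Remove-Job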

Upvotes: 1
