Reputation: 75
I have a script that uses Get-ChildItem to find specific files in a directory. I then query two different SQL tables, compare the files against constraints from both, and delete the files that meet certain criteria.
Basically this is what happens:
For reference, the $include value passed to -Include is a unique ID (string) used as the filename; I'm deleting all files with names similar to it.
Example:
$include: 9d3aa8ee-e60e-4b4f-9cd0-6678f8a5549e*.*
Query table #1, put results in an array.
Query table #2, put results in an array.
~~~ Pseudo code ~~~
foreach ($i in table #1) {
    foreach ($x in table #2) {
        if (constraints are met) {
            $files = Get-ChildItem -Path $path -Recurse -Include $include |
                ForEach-Object { $_.FullName }
            # Delete the files
        }
    }
}
My problem: There are approximately 14 million files on this server.
I've run the script on a test server with about 1.5 million files, and it takes almost two hours.
I tried to run this script on the live server, but after three days it still had not completed.
How can I make this complete in a reasonable amount of time?
Upvotes: 0
Views: 9344
Reputation: 681
Well, I don't know what constraints you mean. But a couple of years back I wrote a cmdlet called Find-ChildItem, which is an alternative to Get-ChildItem.
It has more options built in, such as deleting files larger than a given size, older than a given age, or deleting only empty files. This might let you remove some additional loops and cmdlets from your script and thereby improve performance. You may want to give it a try.
You can get more details about this Find-ChildItem cmdlet on my blog, Unix / Linux find equivalent in Powershell Find-ChildItem Cmdlet.
I hope this helps you a bit...
Upvotes: 0
Reputation: 1067
With 14 million files to work with, just how long does it take to find one such file?
You may simply be fighting with the I/O subsystem and the choice of script might not matter as much.
My suggestion is to baseline the removal of a single file to see whether you can accomplish this task in reasonable time at all; if not, you may need to look at your hardware configuration.
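One way to get that baseline is the built-in Measure-Command cmdlet. A minimal sketch, assuming the $path and GUID pattern from the question:
Measure-Command {
    Get-ChildItem -Path $path -Recurse -Include '9d3aa8ee-e60e-4b4f-9cd0-6678f8a5549e*.*'
} | Select-Object TotalSeconds   # if even one search takes minutes, I/O is the bottleneck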
Upvotes: 0
Reputation: 68303
For just getting the fullname strings from large directory structures, the legacy DIR command with the /B switch can be much faster:
cmd /c dir $path\9d3aa8ee-e60e-4b4f-9cd0-6678f8a5549e*.* /b /s /a-d
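If you want to go from there straight to the delete, one possible sketch is to pipe the /B output (one full path per line) into Remove-Item. Quoting $path guards against spaces, and -LiteralPath avoids re-interpreting any wildcard characters in the returned paths:
cmd /c dir "$path\9d3aa8ee-e60e-4b4f-9cd0-6678f8a5549e*.*" /b /s /a-d |
    ForEach-Object { Remove-Item -LiteralPath $_ -Force }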
Upvotes: 1
Reputation: 126842
If I follow you, you're recursing over a huge directory tree once for each file pattern you want to remove. If that's the case, I would find all the patterns first and only then use a single Get-ChildItem call to remove the files.
$include = foreach ($i in table #1) {
    foreach ($x in table #2) {
        if (constraints are met) {
            output file pattern
        }
    }
}
Get-ChildItem -Path $path -Recurse -Include $include | Remove-Item -Force
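To make that concrete, here's a rough sketch. The $table1 and $table2 arrays, the Test-Constraints function, and the Id property are all hypothetical stand-ins for the question's query results and comparison logic:
$include = foreach ($i in $table1) {
    foreach ($x in $table2) {
        if (Test-Constraints $i $x) {
            "$($i.Id)*.*"   # one wildcard pattern per matching ID (hypothetical property)
        }
    }
}
# A single recursive pass over the tree handles all patterns at once
Get-ChildItem -Path $path -Recurse -Include $include | Remove-Item -Force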
Upvotes: 1