Reputation: 167
I'm currently trying to multithread the inventory of 5,000 groups and 30,000+ users. I took this information offline because I don't want to query our service provider over the internet for each group membership, as the information is 'somewhat' available in a property when retrieving the groups. That aside:
So I have 2 CliXML files that I import into variables; memory usage on the server is 4GB+. Using one ForEach loop to tie both arrays together takes roughly 30 seconds per group, so I wish to use jobs. But there comes my problem: each job requires its own full copy of both XML files to do its lookups, so each additional job fills memory with an extra 4GB.
I wanted to use something like a database (such as SQLite), but the rich properties of the PowerShell objects get lost because it doesn't support the object-oriented columns PowerShell does.
Write-Info "Starting jobs..."
$Start = Get-Date
# Initialize the batch offset once, before the loop, so each run picks up the next batch.
$CurrentBatchStart = 0
For ($i = 0; $i -lt $runs; $i++) {
    $currentBatch = $MailSecurityGroups | Select-Object -First $BatchSize -Skip $CurrentBatchStart
    $CurrentBatchStart += $BatchSize
    # Limit the number of concurrently running jobs.
    while ((Get-Job | Where-Object State -eq "Running").Count -gt 5) {
        Write-Info "Waiting to start next job..."
        $prev = Get-Status -prev $prev
        Start-Sleep -Seconds 10
    }
    Start-Sleep -Seconds (Get-Random -Maximum 5 -Minimum 1)
    Write-Info "Starting job ($i)..."
    Start-Job {
        param([array]$CurrentBatch)
        Start-Transcript "$($using:WorkPath)\GroupExport\BatchOutput_Batch$($using:i).log"
        Write-Host $using:MailSecurityGroups.Count
        Write-Host $using:MailEnabledUsers.Count
        Foreach ($Group in $CurrentBatch) {
            $groupGUID = $Group.GUID.Guid
            [Array]$GroupMembers = @()
            $CurrentGroupMembersExportCSV = "$($using:WorkPath)\GroupExport\$($groupGUID).csv"
            Write-Host "Processing [$($groupGUID)] - $($Group.DisplayName)"
            foreach ($Member in $Group.Members) {
                Write-Host "Processing $($member)"
                $FoundGroup = ($using:MailSecurityGroups | Where-Object {$_.Name -eq $Member.User -or $_.Name -eq $Member.Name -or $_.Name -eq $Member} )
                if ($FoundGroup) {
                    $CurrentObject = $FoundGroup | Select-Object *, @{N="ObjectType";E={"Group"}}
                    Write-Host "Found a group!"
                } else {
                    $FoundUser = $using:MailEnabledUsers | Where-Object {$_.UserPrincipalName -eq $Member.User -or $_.Identity -eq $Member}
                    if ($FoundUser) {
                        $CurrentObject = $FoundUser | Select-Object *, @{N="ObjectType";E={"User"}}
                        Write-Host "Found a User!"
                    } else {
                        Write-Host "Found Nothing.."
                        continue
                    }
                }
                Write-Host "$($CurrentObject.ObjectType)"
                if ($CurrentObject.ObjectType -eq "User") {
                    Write-Host "User: GUID: [$($CurrentObject.GUID.Guid)] - ($($CurrentObject.Name))"
                    $UserObject = [PSCustomObject]@{
                        PrimarySMTPAddress = $CurrentObject.UserPrincipalName
                        GUID = $CurrentObject.Guid.Guid
                        MemberIdentity = $Member
                    }
                    $GroupMembers += $UserObject
                } else {
                    #Kick it out of the query, we will look at this later.
                    $NestedIssue = "$($using:WorkPath)\GroupExport\NESTEDGROUP_$($groupGUID).Csv"
                    Write-Host $NestedIssue
                    $CurrentObject | Export-Csv $NestedIssue -Append -NoTypeInformation -Delimiter ";"
                }
            }
            # Export once per group, after all of its members have been processed.
            $GroupMembers | Export-Csv $CurrentGroupMembersExportCSV -Delimiter ";" -NoTypeInformation
            $Group | Select-Object Guid | Export-Csv "$($using:WorkPath)\GroupExport\zz_processed_Batch$($using:i).csv" -Append -Delimiter ";"
        }
        Stop-Transcript
    } -ArgumentList (,$currentBatch) -Name "zz_Batch$($i)"
}
As noted above, I tried to pass the variables along with $using: and I also tried to import the XML into each job, but the issue is their size. I need some sort of 'centrally queryable variable' that is stored in memory just once.
Upvotes: 1
Views: 71
Reputation: 23830
As described in PowerShell scripting performance considerations, wrapping a cmdlet pipeline such as Export-Csv in a loop, so that it is invoked once per object, can get pretty expensive. To avoid this, you might want to keep the CSV files open by creating multiple steppable pipelines, one per file.
Unfortunately, I can't completely simulate your environment, but to achieve this your script should look something like this:
$MailSecurityGroups | Foreach-Object -Begin {
    $CsvExports = @{}
    function ExportCsv($Path, $Object) {
        if (-not $CsvExports.Contains($Path)) { # Open a new pipeline (file)
            $CsvExports[$Path] = {
                Export-CSV -Path $Path -NoTypeInformation -Delimiter ";"
            }.GetSteppablePipeline()
            $CsvExports[$Path].Begin($true)
        }
        $CsvExports[$Path].Process($Object) # Export the object
    }
} -Process {
    $Group = $_
    Function Write-Info {
        param ($text)
        Write-Host "[$(get-date)] $text"
    }
    Write-Info "Processing [$($Group.GUID.Guid)] - $($Group.DisplayName)"
    #Pass the variables along
    # $MailSecurityGroups = $using:MailSecurityGroups
    # $MailEnabledUsers = $using:MailEnabledUsers
    # $WorkPath = $using:WorkPath
    # No $using: here: this is a regular (non-parallel) ForEach-Object running in the caller's scope.
    $CurrentGroupMembersExportCSV = "$WorkPath\GroupExport\$($Group.Guid.Guid).csv"
    $Group.Members | Foreach-Object {
        $Member = $_
        Function Write-Info {
            param ($text)
            Write-Host "[$(get-date)] $text"
        }
        Function Get-ObjectType {
            param ($PermissionObject)
            $FoundGroup = ($MailSecurityGroups | Where-Object {$_.Name -eq $PermissionObject.User -or $_.Name -eq $PermissionObject.Name -or $_.Name -eq $PermissionObject} )
            if ($FoundGroup) {
                return $FoundGroup | Select-Object *, @{N="ObjectType";E={"Group"}}
            }
            $FoundUser = $MailEnabledUsers | Where-Object {$_.UserPrincipalName -eq $PermissionObject.User -or $_.Identity -eq $PermissionObject}
            if ($FoundUser) {
                return $FoundUser | Select-Object *, @{N="ObjectType";E={"User"}}
            }
        }
        Write-Info "Processing $($Member)"
        $CurrentObject = Get-ObjectType $Member
        if ($CurrentObject.ObjectType -eq "User") {
            Write-Info "User: GUID: [$($CurrentObject.GUID)] - ($($CurrentObject.Name))"
            $UserObject = [PSCustomObject]@{
                PrimarySMTPAddress = $CurrentObject.UserPrincipalName
                GUID = $CurrentObject.Guid
                MemberIdentity = $Member
            }
            # $UserObject | Export-CSV $using:CurrentGroupMembersExportCSV -Delimiter ";" -NoTypeInformation -Append
            ExportCsv -Path $CurrentGroupMembersExportCSV -Object $UserObject
        } else {
            $NestedIssue = "$WorkPath\GroupExport\NESTEDGROUP_$($Group.GUID.Guid).Csv"
            Write-Host $NestedIssue
            # $CurrentObject | Export-Csv $NestedIssue -Append -NoTypeInformation -Delimiter ";"
            ExportCsv -Path $NestedIssue -Object $CurrentObject
        }
    }
} -End {
    $CsvExports.Values.foreach{ $_.End() } # Close all pipelines (files)
}
For more background, see: Mastering the (steppable) pipeline
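To make the pattern above easier to follow in isolation, here is a minimal sketch of a single steppable pipeline around Export-Csv. The temp file name and the sample objects are made up for illustration; only the Begin/Process/End sequence is the point.

```powershell
# Hypothetical output file in the temp folder (illustration only).
$path = Join-Path ([System.IO.Path]::GetTempPath()) "steppable_demo.csv"

# Create the pipeline once; Export-Csv keeps the file open until End() is called.
$pipeline = { Export-Csv -Path $path -NoTypeInformation -Delimiter ';' }.GetSteppablePipeline()
$pipeline.Begin($true)   # $true: the pipeline expects input

foreach ($n in 1..3) {
    # Feed one object at a time instead of re-running Export-Csv -Append per object.
    $pipeline.Process([PSCustomObject]@{ Id = $n; Name = "Item$n" })
}

$pipeline.End()          # Flush and close the file

# Read the file back to check what was written, then clean up.
$rows = @(Import-Csv -Path $path -Delimiter ';')
Remove-Item -Path $path
```

The cmdlet is started once and fed many objects, which avoids the per-call startup and file open/close cost of calling Export-Csv -Append inside the loop.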
Upvotes: 0
Reputation: 167
Based on the comment of @mklement0 I've updated my code as below, using the v7+ ForEach-Object -Parallel feature instead of background jobs to achieve parallelism:
$MailSecurityGroups | Foreach-Object -Parallel {
    $Group = $_
    Function Write-Info {
        param ($text)
        Write-Host "[$(get-date)] $text"
    }
    Write-Info "Processing [$($Group.GUID.Guid)] - $($Group.DisplayName)"
    #Pass the variables along
    $MailSecurityGroups = $using:MailSecurityGroups
    $MailEnabledUsers = $using:MailEnabledUsers
    $WorkPath = $using:WorkPath
    $CurrentGroupMembersExportCSV = "$($using:WorkPath)\GroupExport\$($Group.Guid.Guid).csv"
    $Group.Members | Foreach-Object -Parallel {
        $Member = $_
        Function Write-Info {
            param ($text)
            Write-Host "[$(get-date)] $text"
        }
        Function Get-ObjectType {
            param ($PermissionObject)
            $FoundGroup = ($using:MailSecurityGroups | Where-Object {$_.Name -eq $PermissionObject.User -or $_.Name -eq $PermissionObject.Name -or $_.Name -eq $PermissionObject} )
            if ($FoundGroup) {
                return $FoundGroup | Select-Object *, @{N="ObjectType";E={"Group"}}
            }
            $FoundUser = $using:MailEnabledUsers | Where-Object {$_.UserPrincipalName -eq $PermissionObject.User -or $_.Identity -eq $PermissionObject}
            if ($FoundUser) {
                return $FoundUser | Select-Object *, @{N="ObjectType";E={"User"}}
            }
        }
        Write-Info "Processing $($Member)"
        $CurrentObject = Get-ObjectType $Member
        if ($CurrentObject.ObjectType -eq "User") {
            Write-Info "User: GUID: [$($CurrentObject.GUID)] - ($($CurrentObject.Name))"
            $UserObject = [PSCustomObject]@{
                PrimarySMTPAddress = $CurrentObject.UserPrincipalName
                GUID = $CurrentObject.Guid
                MemberIdentity = $Member
            }
            $UserObject | Export-CSV $using:CurrentGroupMembersExportCSV -Delimiter ";" -NoTypeInformation -Append
        } else {
            # $Group lives in the outer parallel block, so it has to be referenced via $using: here.
            $NestedIssue = "$($using:WorkPath)\GroupExport\NESTEDGROUP_$($using:Group.GUID.Guid).Csv"
            Write-Host $NestedIssue
            $CurrentObject | Export-Csv $NestedIssue -Append -NoTypeInformation -Delimiter ";"
        }
    } -ThrottleLimit 5
} -ThrottleLimit 10
The -ThrottleLimit values still need some tuning, but it seems way faster and no additional memory is being consumed!
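Since each member lookup above scans the full arrays with Where-Object, a further speed-up might come from building hashtable indexes once and looking members up by key. The following is only a sketch with made-up sample data: it assumes members are plain strings and that names, UPNs and identities are unique and non-null, which may not hold for the real export. The names $UsersByKey, $GroupsByName and Resolve-Member are hypothetical.

```powershell
# Hypothetical sample data standing in for the imported CliXML arrays.
$MailEnabledUsers = @(
    [PSCustomObject]@{ UserPrincipalName = 'alice@example.com'; Identity = 'Alice' }
    [PSCustomObject]@{ UserPrincipalName = 'bob@example.com';   Identity = 'Bob' }
)
$MailSecurityGroups = @(
    [PSCustomObject]@{ Name = 'Sales';    Members = @('alice@example.com', 'Bob', 'Managers', 'Unknown') }
    [PSCustomObject]@{ Name = 'Managers'; Members = @('bob@example.com') }
)

# Build each index once. String keys in @{} are case-insensitive by default, like -eq.
$UsersByKey = @{}
foreach ($User in $MailEnabledUsers) {
    $UsersByKey[$User.UserPrincipalName] = $User
    $UsersByKey[$User.Identity] = $User
}
$GroupsByName = @{}
foreach ($G in $MailSecurityGroups) { $GroupsByName[$G.Name] = $G }

# Classify a member with two key lookups instead of two full array scans.
function Resolve-Member {
    param ([string]$Member)
    if ($GroupsByName.ContainsKey($Member)) { return 'Group' }
    if ($UsersByKey.ContainsKey($Member))   { return 'User' }
    return 'Unknown'
}

$Results = foreach ($Member in $MailSecurityGroups[0].Members) {
    [PSCustomObject]@{ Member = $Member; ObjectType = Resolve-Member $Member }
}
```

Each index holds references to the existing objects rather than copies, so it adds little memory on top of the imported arrays.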
Upvotes: 1