Keith Langmead
Keith Langmead

Reputation: 1157

Concatenating non-unique values alongside a unique field in PowerShell

I have the following set of queries which retrieve a set of data, merge it down so I just get the unique values and add a number to each of them (so I can select that particular item later).

$allMoveRequests = Get-MoveRequest -DomainController server | 
    select Alias,Status,TargetDatabase,BatchName 

$optNum=1
$AllMoveBatches = @($allMoveRequests | Sort-Object -Property BatchName | 
    Select-Object BatchName,TargetDatabase -Unique) |
    Select @{Name="Option";Expression={$optNum;$optNum++}},BatchName,TargetDatabase

$AllMoveBatches | Format-Table -AutoSize | Out-String|% {Write-Host $_}

This returns :

Option BatchName TargetDatabase
------ --------- --------------
1      Batch1    Database1
2      Batch2    Database2

etc. That works as it should, but what I'd like to add is the Status value from those batches, combined where there is more than one rather than creating duplicate entries. For instance, if I simply add Status into the second bit of code I end up with :

Option BatchName TargetDatabase Status
------ --------- -------------- ------
1      Batch1    Database1      Completed
2      Batch1    Database1      In Progress
3      Batch2    Database2      Completed

while what I'd ideally like would be :

Option BatchName TargetDatabase Status
------ --------- -------------- ------
1      Batch1    Database1      Completed,InProgress
2      Batch2    Database2      Completed

I've tried using an expression in the select statement to query all the relevant Status entries and apply -Unique to them, but that just returns all Status entries across all batches, not just those relevant to the current Batch line.

Is there a way to achieve this?

Upvotes: 1

Views: 251

Answers (2)

mclayton
mclayton

Reputation: 9975

It's not pretty, and it might not be very performant with lots of data, but here's one way to do it...

First, lets create some sample data:

$data = @(
    (new-object PSObject -Property ([ordered] @{
        "BatchName"      = "Batch1"
        "TargetDatabase" = "Database1"
        "Status"         = "Completed"
    })),
    (new-object PSObject -Property ([ordered] @{
        "BatchName"      = "Batch1"
        "TargetDatabase" = "Database1"
        "Status"         = "In Progress"
    })),
    (new-object PSObject  -Property ([ordered] @{
        "BatchName"      = "Batch2"
        "TargetDatabase" = "Database2"
        "Status"         = "Completed"
    }))
)

now, process it:

Set-Variable -Name "optNum" -Option AllScope -Value 1
$results = @( $data | group-object BatchName, TargetDatabase ) `
    | select-object @{Name="Option";Expression={$optNum; $optNum++}},
                    @{Name="BatchName";Expression={$_.Group[0].BatchName}},
                    @{Name="TargetDatabase";Expression={$_.Group[0].TargetDatabase}},
                    @{Name="Status";Expression={$_.Group.Status -join ", "}} `
    | sort-object -Property BatchName

and show the result:

PS> $results

Option BatchName TargetDatabase Status
------ --------- -------------- ------
     1 Batch1    Database1      Completed, In Progress
     2 Batch2    Database2      Completed

What it's doing is grouping to select the unique combinations of BatchName and DatabaseName and then to make the results it's selecting the BatchName and DatabaseName from the first item in each group, and concatenating all of the Status properties from the items in that group (you could also process the statuses in the Status Expression if you wanted to e.g. sort, filter or de-dupe them within each group).

Note that I've moved your original sort-object BatchName to the end of the pipeline. There's no point sorting, say 1000 objects only to throw half of them away - you might as well sort at the end.

And I could only get your "Option" counter to work by using Set-Variable to make it AllScope as the $optNum++ wasn't incrementing the variable properly when I used $optNum = 1 to initialise it.

Upvotes: 2

Rich Moss
Rich Moss

Reputation: 2384

mclayton's answer should be the accepted one, but here's a slightly more concise version that uses one of my favorite Powershell idioms: Foreach with a -Begin scriptblock {$i=1} that executes only once.

[pscustomobject]@{BatchName = 'Batch1';TargetDatabase='Database1';Status='Completed'},
[pscustomobject]@{BatchName = 'Batch1';TargetDatabase='Database1';Status='In Progress'},
[pscustomobject]@{BatchName = 'Batch2';TargetDatabase='Database2';Status='Completed'} |
    Group BatchName, TargetDatabase | 
    %{$i=1}{ [pscustomobject]@{Option = $i++ 
            BatchName = $_.Group[0].BatchName
            TargetDatabase = $_.Group[0].TargetDatabase
            Status = $_.Group.Status -join ','}
    } 

Upvotes: 1

Related Questions