Reputation: 415
I have some array with lot of rows (thousands and more). It have columns like 'ColumnXXXXX'
Row example:
Group_name = "proxy_users"
Column1 = "domain\Igor"
Column2 = null
.......
Column989 = 'domain\Andrew'
Column999 = 'domain\Mike'
What is the right and FAST way to create new variable that will be the sum of 'ColumnXXX' ignoring 'null' values?
Like "domain\igor, domain\Andrew, domain\mike"
I can use smth like $group | select -Property "Column*"...but how to sum and how to ignore null ?
Upvotes: 0
Views: 818
Reputation: 440132
To complement Frode F.'s elegant ScriptProperty-based solution with a more concise PSv4+ version that also performs better due to avoiding the use of cmdlets (a pipeline) in the property-definition script block in favor of operators (with a large number of input objects, this may matter):
$o | Add-Member -PassThru -MemberType ScriptProperty -Name Joined -Value {
@($this.psobject.Properties.Where({$_.Name -match 'Column*'}).Value) -ne 'null' -join ', '
}
Note the use of:
The .Where()
collection operator (PSv4+), a faster alternative to the Where-Object
cmdlet.
Member-access enumeration (PSv3+), where accessing a property at the collection level yields an array of the elements' property values;
e.g., $this.psobject.Properties.Name
yields the .Name
property values of all elements of the $this.psobject.Properties
collection.
Applying comparison operator -ne
to an array-valued LHS, in which case the operator acts as a filter: the operator is applied to each element, and the matching elements are returned as an array; note the @(...)
around the LHS, which ensures that it is treated as an array even if it happens to return only a single value.
With the sample data in the question, the above yields (look for property Joined
):
Group_name : proxy_users
Column1 : domain\Igor
Column2 : null
Column989 : domain\Andrew
Column999 : domain\Mike
Joined : domain\Igor, domain\Andrew, domain\Mike
With the above optimizations, you could even consider a simpler Select-Object
solution that constructs new custom-object instances that contain all of the input object's properties (*
) plus a new calculated property that contains the combined column values of interest:
$o | Select-Object *, @{ n='Joined'; e={
@(($_.psobject.Properties.Where({$_.Name -match 'Column*'})).Value) -ne 'null' -join ', ' }
}
This solution is slower than the one above, but not by much.
A caveat is that if you collect all output in memory, the newly constructed objects take up space in addition to the originals.
The code at the bottom times the various approaches from Frode's and my answer.
Here's a sample timing from my machine - the absolute numbers aren't important, but their ratios are (though I don't know how factors such as CPU core count and disk speed would come into play) - input set size is 1000
objects:
Approach TotalSeconds
-------- ------------
add-member w/ scriptproperty - operators 0.35
select-object w/ noteproperty 0.40
add-member w/ scriptproperty - pipeline 0.72
add-member w/ noteproperty - pipeline 0.98
Conclusions:
The solutions that use pipelines in the property definition are noticeably slower, by a factor of about 2, which doesn't seem to vary even with larger input counts.
The optimized Select-Object
-based solution that creates new objects (rather than adding properties to the input objects) is only slightly slower than the optimized Add-Member
solution.
Source code for the tests:
# The number of input objects to test with.
# Adjust to experiment.
$objCount = 1000
Write-Verbose -Verbose "# of input objects: $objCount"
$ndx = 0; $approaches = 'select-object w/ noteproperty',
'add-member w/ scriptproperty - operators',
'add-member w/ scriptproperty - pipeline',
'add-member w/ noteproperty - pipeline'
$tempFile = New-TemporaryFile # requires PSv5+
$(foreach($approach in $approaches) {
# Create test collection (recreate it in every iteration to support repeated Add-Member operations)
$al = [System.Collections.ArrayList]::new()
for ($i = 0; $i -lt $objCount; ++$i) {
$null = $al.add([pscustomobject] @{ one = 1; Column1 = 'col1'; Column2 = 'col2'; Column3 = 'null'; Column4 = 'col4' })
}
Measure-Command {
Write-Verbose -Verbose "Running: $approach..."
switch ($ndx) {
0 { # select-object w/ noteproperty
$al | Select-Object *, @{ n='Joined'; e={ @(($_.psobject.Properties.Where({ $_.Name -match 'Column*'})).Value) -ne 'null' -join ', ' } } |
Export-Csv $tempFile
break
}
1 { # add-member w/ scriptproperty - operators
$al | Add-Member -PassThru -MemberType ScriptProperty -Name Joined -Value { @($this.psobject.Properties.Where({ $_.Name -match 'Column*'}).Value) -ne 'null' -join ', ' } |
Export-Csv $tempFile
break
}
2 { # add-member w/ scriptproperty - pipeline
$al | Add-Member -PassThru -MemberType ScriptProperty -Name Joined -Value { ($this.psobject.Properties | Where-Object { $_.Name -match 'Column*' -and $_.Value -ne 'null' } | Select-Object -ExpandProperty Value) -join ', ' } |
Export-Csv $tempFile
break
}
3 { # add-member w/ noteproperty - pipeline; requires an intermediate ForEach-Object call
$al | ForEach-Object {
$_ | Add-Member -PassThru -NotePropertyName Joined -NotePropertyValue (($_.psobject.Properties | Where-Object { $_.Name -match 'Column*' -and $_.Value -ne 'null' } | Select-Object -ExpandProperty Value) -join ', ' )
} |
Export-Csv $tempFile
break
}
default { Throw "What are you doing here?" }
}
# Import-Csv $tempFile | Select-Object -First 1 Joined | Write-Verbose -Verbose
++$ndx
} | Select-Object @{ n='Approach'; e={ $approach }}, @{ n='TotalSeconds'; e={ '{0:N2}' -f $_.TotalSeconds } }
}) | Sort-Object { [double] $_.TotalSeconds }
Remove-Item $tempFile
Upvotes: 0
Reputation: 54971
You can list all properties using ex. $_.psobject.properties
, filter out the ones you want and use -join
to combine the values. Ex
$o = [pscustomobject]@{
Group_Name = "Test"
Column1 = "Foo"
Column2 = $null
Column3 = "1"
}, [pscustomobject]@{
Group_Name = "Test2"
Column1 = $null
Column2 = "Bar"
Column3 = "2"
}
#Add property as constant
$o | ForEach-Object {
$_ | Add-Member -NotePropertyName Summed -NotePropertyValue (($_.psobject.Properties | Where-Object { $_.Name -ne 'Group_name' -and $_.Value -ne 'null' } | Select-Object -ExpandProperty Value) -join '' )
}
$o | ft
Or you can use a ScriptProperty to calculate the value on every call
#Remember to exclude itself to avoid infinite recursion
$o | Add-Member -MemberType ScriptProperty -Name Summed -Value {($this.psobject.Properties | Where-Object { $_.Name -ne 'Group_name' -and $_.Name -ne 'Summed' -and $_.Value -ne 'null' } | Select-Object -ExpandProperty Value) -join '' }
$o | ft
Result:
Group_Name Column1 Column2 Column3 Summed
---------- ------- ------- ------- ------
Test Foo 1 Foo1
Test2 Bar 2 Bar2
As an alternative, you can use $_.Name -like 'Column*' -and $_.Value -ne 'null'
as the property-filter if they are actually called ColumnXXX
.
Upvotes: 2