Reputation: 1197
I am trying to get items in an array object(has all the data and way more rows) (array object 1) which also exists in a different array (Array object 2)(has some of the fields, and fewer rows)
How would I get the data from array object 1 using multiple fields since many fields repeat?
Important point: No single Entry in a field is unique, but combinations of 4 fields are
Specifically how to do this in a few steps/lines of code and without much time and/or resource consumption (I can figure out how to do this using loops but that's inefficient)
Example
ARRAY OBJECT 1
Number : 10
Place : UT
Color : yellow
Zone : FAHIVT
Group : 20
Number : 29
Place : NY
Color : white
Zone : HWOUKK
Group : 8
Number : 66
Place : TX
Color : black
Zone : KZZGKI
Group : 2
number : 127
Place : AL
Color : white
Zone : DMXDZR
Group : 14
ARRAY OBJECT 2
Place : NY
Color : white
Group : 2
Zone : TVQJPN
Place : PA
Color : blue
Group : 4
Zone : AAYYSN
Place : NJ
Color : red
Group : 17
Zone : DXKSVE
The result I seek after matching or filtering is like
filtered Array Object 1
number : 1730
Place : NY
Color : white
Zone : TVQJPN
Group : 2
number : 2199
Place : PA
Color : blue
Zone : AAYYSN
Group : 4
number : 2746
Place : NJ
Color : red
Zone : DXKSVE
Group : 17
What I ended up doing was some "Feature Engineering" and combining the 4 fields that are unique as a combo into one string and added that as a property to each of the Array Objects. Combo = Place + Color + Zone + Group
Example
number : 1730
Place : NY
Color : white
Zone : TVQJPN
Group : 2
Combo : NYwhiteTVQJPN2
Assume $wholearray
is the bigger more fuller array and $partialobject
is the one smaller one we need to filter by.
The Following is my CURRENT CODE
$wholearray.ForEach({
$thisline = $_ ;
$combo = $thisline.Place + $thisline.Color + $thisline.Zone + $thisline.Group;
$wholearray.Where({$_ -eq $thisline}) | Add-Member -NotePropertyName Combo -NotePropertyValue $combo
}
$partialobject.ForEach({
$thisline = $_ ;
$combo = $thisline.Place + $thisline.Color + $thisline.Zone + $thisline.Group;
$partialobject.Where({$_ -eq $thisline}) | Add-Member -NotePropertyName Combo -NotePropertyValue $combo
}
$filtereddata = $wholearray.Where({$_.Combo -in $($partialobject.Combo)})
It works with an average processing time of 0.08secs per instance in the loop But its still Slow. It will take 5 mins just for a couple of thousand rows/instances. When I have to do this with 20k or more it would be a disaster. All The while it could be done in 15 secs with even more rows in Excel using vlookup. And POSH or CLI should be faster than Excel.
How can I make this more efficient?
How can I make this faster?
Upvotes: 0
Views: 2682
Reputation:
A single Compare-Object
could accomplish your task.
As Compare-Object
isn't known for speed you've got at least an alternative to measure time.
$Filtered = Compare-Object -Ref $WholeArray -Diff $PartialArray `
-Property Place,Color,Zone,Group `
-IncludeEqual -ExcludeDifferent -PassThru
To remove the inserted property SideIndicator
you might do:
$Data = $Filtered | Select-Object -Property * -ExcludeProperty SideIndicator
Upvotes: 1
Reputation: 61218
Perhaps using this code and a lookup Hashtable will speed things up for you:
Add the Combo
property to the $wholearray
objects:
$wholearray | ForEach-Object {
$combo = '{0}{1}{2}{3}' -f $_.Place, $_.Color, $_.Zone, $_.Group
$_ | Add-Member -MemberType NoteProperty -Name Combo -Value $combo
}
For the objects in the partial array, you create a Hashtable object:
$lookup = @{}
$partialarray | ForEach-Object {
$combo = '{0}{1}{2}{3}' -f $_.Place, $_.Color, $_.Zone, $_.Group
$lookup[$combo] = $true # the key is the important thing here, the value doesn't matter
}
Then get your filtered data using:
$filtereddata = $wholearray | Where-Object { $lookup.ContainsKey($_.Combo)}
Hope that helps
Upvotes: 1