AdilZ
AdilZ

Reputation: 1197

Powershell - How to filter array object with multiple keys/fields/columns of another array object EFFICIENTLY?

I am trying to get items in an array object(has all the data and way more rows) (array object 1) which also exists in a different array (Array object 2)(has some of the fields, and fewer rows)

How would I get the data from array object 1 using multiple fields since many fields repeat?

Important point: No single Entry in a field is unique, but combinations of 4 fields are

Specifically how to do this in a few steps/lines of code and without much time and/or resource consumption (I can figure out how to do this using loops but that's inefficient)

Example

ARRAY OBJECT 1

Number : 10
Place  : UT
Color  : yellow
Zone   : FAHIVT
Group  : 20


Number : 29
Place  : NY
Color  : white
Zone   : HWOUKK
Group  : 8


Number : 66
Place  : TX
Color  : black
Zone   : KZZGKI
Group  : 2


number : 127
Place  : AL
Color  : white
Zone   : DMXDZR
Group  : 14

ARRAY OBJECT 2

Place : NY
Color : white
Group : 2
Zone  : TVQJPN

Place : PA
Color : blue
Group : 4
Zone  : AAYYSN

Place : NJ
Color : red
Group : 17
Zone  : DXKSVE

The result I seek after matching or filtering is like

filtered Array Object 1

number : 1730
Place  : NY
Color  : white
Zone   : TVQJPN
Group  : 2



number : 2199
Place  : PA
Color  : blue
Zone   : AAYYSN
Group  : 4



number : 2746
Place  : NJ
Color  : red
Zone   : DXKSVE
Group  : 17

What I ended up doing was some "Feature Engineering" and combining the 4 fields that are unique as a combo into one string and added that as a property to each of the Array Objects. Combo = Place + Color + Zone + Group

Example

number : 1730
Place  : NY
Color  : white
Zone   : TVQJPN
Group  : 2
Combo  : NYwhiteTVQJPN2

Assume $wholearray is the bigger more fuller array and $partialobject is the one smaller one we need to filter by.

The Following is my CURRENT CODE

    $wholearray.ForEach({
        $thisline = $_ ;
        $combo = $thisline.Place + $thisline.Color + $thisline.Zone + $thisline.Group;
        $wholearray.Where({$_ -eq $thisline}) | Add-Member -NotePropertyName Combo -NotePropertyValue $combo
    }

    $partialobject.ForEach({
        $thisline = $_ ;
        $combo = $thisline.Place + $thisline.Color + $thisline.Zone + $thisline.Group;
        $partialobject.Where({$_ -eq $thisline}) | Add-Member -NotePropertyName Combo -NotePropertyValue $combo
    }

    $filtereddata = $wholearray.Where({$_.Combo -in $($partialobject.Combo)}) 

It works with an average processing time of 0.08secs per instance in the loop But its still Slow. It will take 5 mins just for a couple of thousand rows/instances. When I have to do this with 20k or more it would be a disaster. All The while it could be done in 15 secs with even more rows in Excel using vlookup. And POSH or CLI should be faster than Excel.

How can I make this more efficient?

How can I make this faster?

Upvotes: 0

Views: 2682

Answers (2)

user6811411
user6811411

Reputation:

A single Compare-Object could accomplish your task.

As Compare-Object isn't known for speed you've got at least an alternative to measure time.

$Filtered = Compare-Object -Ref $WholeArray -Diff $PartialArray `
                           -Property Place,Color,Zone,Group `
                           -IncludeEqual -ExcludeDifferent -PassThru

To remove the inserted property SideIndicator you might do:

$Data = $Filtered | Select-Object -Property * -ExcludeProperty SideIndicator

Upvotes: 1

Theo
Theo

Reputation: 61218

Perhaps using this code and a lookup Hashtable will speed things up for you:

Add the Combo property to the $wholearray objects:

$wholearray | ForEach-Object {
    $combo = '{0}{1}{2}{3}' -f $_.Place, $_.Color, $_.Zone, $_.Group
    $_ | Add-Member -MemberType NoteProperty -Name Combo -Value $combo
}

For the objects in the partial array, you create a Hashtable object:

$lookup = @{}

$partialarray | ForEach-Object {
    $combo = '{0}{1}{2}{3}' -f $_.Place, $_.Color, $_.Zone, $_.Group
    $lookup[$combo] = $true   # the key is the important thing here, the value doesn't matter
}

Then get your filtered data using:

$filtereddata = $wholearray | Where-Object { $lookup.ContainsKey($_.Combo)} 

Hope that helps

Upvotes: 1

Related Questions