Reputation: 11995

powershell filter recordset from csv file

Say I have a simple csv file. The model contains 4 fields id, name, group, and sub. the group field has integer values between 1 and 15 and it can repeat.

So what I am trying to do is select all records not in groups 7 and 9...I've written the below code:

$csv = Import-Csv -Path names.csv -Header @('id', 'name', 'group', 'sub')
$total = $csv | select group -unique
write-host "total:", $total.length
$ignore = $csv | where { $_.sub -eq 'a'}
write-host "total to ignore:", $ignore.length
$ignoreGroups = $ignore | select group -unique
write-host "Groups to ignore:", $ignoreGroups.length

$workingGroups = $csv | where {$ignore -notcontains $_ } | select group -unique

write-host "Working groups count:", $workingGroups.Length

The previous line reports a wrong result.

My goal would be to process the records belonging to $csv one group at a time (i.e group in $workingGroups).

Ps: This is only a model, not the real data. The real data is a huge log file I have to process.

Sample Data: https://gist.github.com/deostroll/4ced74eef461de61f477

Edit: I am trying to select the distict group values using select-object, but what I am getting is an array of objects (each object with group property). What do I do to get distinct group values alone as an array of integers...?

Upvotes: 1

Answers (3)

deostroll

Reputation: 11995

Alternate way: Use -ExpandProperty to get a pure array instead of Object[]

$workingGroups = $csv | where {$_.sub -ne 'a'} | select -ExpandProperty group -unique

foreach($group in $workingGroups) {
    $recordset = $csv | where { $_.group -eq $group }
    #do something with $recordset
}

Upvotes: 0

Jan Chrbolka

Reputation: 4464

This is my understanding...

Requirements

Process CSV data
Exclude all records where sub = "a"
Exclude groups 7 & 9
Group remaining data by Group for processing

Code

$WorkingList = $csv | where { ($_.sub -ne "a") -and (@(9,7) -notcontains $_.group)}
$WorkingGroups = $WorkingList | group-object group

and this is what you sample data ends up looking like

PS > $WorkingGroups

Count Name                      Group                                                                                                                                                                                
----- ----                      -----                                                                                                                                                                                
    2 2                         {@{id=5; name=Carlene Preston; group=2; sub=b}, @{id=9; name=Mccullough Fisher; group=2; sub=b}}                                                                                     
    2 3                         {@{id=10; name=Tara Gallagher; group=3; sub=b}, @{id=12; name=Morton Whitley; group=3; sub=b}}                                                                                       
    1 4                         {@{id=4; name=Shannon Donovan; group=4; sub=b}}                                                                                                                                      
    1 5                         {@{id=1; name=Chandler Cox; group=5; sub=b}}                                                                                                                                         
    1 8                         {@{id=3; name=Brown Gardner; group=8; sub=b}}                                                                                                                                        
    3 11                        {@{id=0; name=Eleanor Whitney; group=11; sub=b}, @{id=2; name=Georgette Simpson; group=11; sub=b}, @{id=7; name=Holland Mccarty; group=11; sub=b}}                                   
    1 12                        {@{id=14; name=Butler Hale; group=12; sub=b}}                                                                                                                                        
    1 14                        {@{id=8; name=Larson Larson; group=14; sub=b}}

Is this close to what you were after?

EDIT:

To process the data simply, iterate through groups and items.

foreach ($group in $WorkingGroups){
    write-host "Processing Group: $($group.Name)"
    foreach ($item in $group.Group){
        write-host "`tProcessing Item: $($item.Name)"
    }
}

Result

Processing Group: 2
    Processing Item: Carlene Preston
    Processing Item: Mccullough Fisher
Processing Group: 3
    Processing Item: Tara Gallagher
    Processing Item: Morton Whitley
Processing Group: 4
    Processing Item: Shannon Donovan
Processing Group: 5
    Processing Item: Chandler Cox
Processing Group: 8
    Processing Item: Brown Gardner
Processing Group: 11
    Processing Item: Eleanor Whitney
    Processing Item: Georgette Simpson
    Processing Item: Holland Mccarty
Processing Group: 12
    Processing Item: Butler Hale
Processing Group: 14
    Processing Item: Larson Larson

EDIT2

An array of groups is already built-in to the data. To access an array of groups:

$WorkingGroups.Name

Upvotes: 1

David Brabant

Reputation: 43589

"So what I am trying to do is select all records not in groups 7 and 9"

$csv | ?{ $_.group -ne 7 -and $_.group -ne 9 }

I really don't see what your code attempts to do.

After your comment:

$groups = $csv | ?{ $_.sub -ne "a" } | group-object group

You can then access groups this way:

$groups | %{ $_.Name; $_.Group | %{ $_ /*Do whatever you want with each group member here */} }

Upvotes: 1

powershell filter recordset from csv file

Answers (3)

Related Questions