Reputation: 3
I have tried several different variations based on some other stack overflow articles, but I will share a sample of what I have and a sample output and then some cobbled-together code hoping for some direction from the community:
C:\Scripts\contacts.csv:
id,first_name,last_name,email
1,john,smith,[email protected]
1,jane,smith,[email protected]
2,jane,smith,[email protected]
2,john,smith,[email protected]
3,sam,jones,[email protected]
3,sandy,jones,[email protected]
Need to turn this into a file where column "email" is unique to column "id". In other words there can be duplicate addresses, but only if there is a different id.
desired output C:\Scripts\contacts-trimmed.csv:
id,first_name,last_name,email
1,john,smith,[email protected]
2,john,smith,[email protected]
3,sam,jones,[email protected]
3,sandy,jones,[email protected]
I have tried this with a few different variations:
Import-Csv C:\Scripts\contacts.csv | sort first_name | Sort-Object -Property id,email -Unique | Export-Csv C:\Scripts\contacts-trim.csv -NoTypeInformation
Any help or direction would be most appreciated
Upvotes: 0
Views: 740
Reputation: 174545
You'll want to use the Group-Object
cmdlet, to, well, group together records with similar values:
$records = @'
id,first_name,last_name,email
1,john,smith,[email protected]
1,jane,smith,[email protected]
2,jane,smith,[email protected]
2,john,smith,[email protected]
3,sam,jones,[email protected]
3,sandy,jones,[email protected]
'@ |ConvertFrom-Csv
# group records based on id and email column
$records |Group-Object id,email |ForEach-Object {
# grab only the first record from each group
$_.Group |Select-Object -First 1
} |Export-Csv .\no_duplicates.csv -NoTypeInformation
Upvotes: 1