Fr3nZy

Reputation: 65

Sort an array numerically by extracting integers in names

$Folders = Get-ChildItem -LiteralPath $PSScriptRoot | Where-Object {$_.PSIsContainer} | Select-Object -ExpandProperty BaseName

I get the output

Set 1
Set 10
Set 11 - A Memo
Set 2
Set 20
Set 22 - A Memo With Numbers 1234
Set 3
Set 33 - A Memo
...

$Folders = $Folders | Sort-Object {[INT]($_ -Replace 'Set ', '')}

This sorts the names in the right order, but it doesn't work if there is anything after the number, like ' - A Memo'.

I've tried \b\d+\b on https://regexr.com but don't know how to implement that in this case. I need a regex that can extract the number after 'Set ' and discard everything else.

RegEx is a whole other language in itself.

Upvotes: 0

Views: 702

Answers (2)

zett42

Reputation: 27756

Some alternatives for extracting the number, complementing g.sulman's excellent answer.

First, the simplest method, assuming "Set" and the number are always separated by a single space:

$Folders | Sort-Object { [int]($_ -split ' ')[1] }

This uses the -split operator to split the string on the space character, which returns an array of the parts. It then takes the second element (index 1) and converts it to int.
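To make the intermediate values visible, here is a small sketch of the same steps on one of the sample names from the question. The variable names $name, $parts and $number are only illustrative.

```powershell
# Illustrative variable; the value is one of the folder names from the question.
$name = 'Set 11 - A Memo'

# -split returns an array of the space-separated parts.
$parts = $name -split ' '       # 'Set', '11', '-', 'A', 'Memo'

# Index 1 is the second element; casting it to [int] yields a number.
$number = [int]$parts[1]
$number                         # 11
```

Because only the second element is used, anything after the number does not affect the sort key.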


Use -match operator:

$Folders | Sort-Object { [int]( $_ -match '\d+' ? $matches[0] : 0 ) }

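Note that the ternary operator (? :) requires PowerShell 7 or later. An alternative for older versions, where the if statement has to be wrapped in the subexpression operator $( ) because plain parentheses accept only expressions, not statements:

$Folders | Sort-Object { [int]$( if( $_ -match '\d+' ){ $matches[0] } else { 0 } ) }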

The -match operator finds the first substring that matches the RegEx \d+, which stands for one or more digits. The matched substring can be accessed through $matches[0].
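A short sketch of what -match leaves in $matches, using the sample name that contains two numbers. The second variant is an assumption on my part rather than something from the question: it presumes every name starts with 'Set ' and uses a capture group so that only the number directly after that prefix is picked up.

```powershell
# Illustrative variable; the value is one of the folder names from the question.
$name = 'Set 22 - A Memo With Numbers 1234'

# First run of digits anywhere in the string; the later 1234 is ignored.
if ($name -match '\d+') {
    $first = $matches[0]        # '22'
}

# Assumed variant: anchor on the 'Set ' prefix and capture the digits after it.
# $matches[1] holds the text of the first capture group.
if ($name -match '^Set (\d+)') {
    $afterSet = [int]$matches[1]   # 22
}
```

The anchored form only matters if a prefix could itself contain digits; for the names shown in the question both variants yield the same number.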


Use Select-String cmdlet:

$Folders | Sort-Object { [int] ( $_ | Select-String -Pattern '\d+' ).Matches[0].Value }

Same principle as the -match method, just a different way to access the matched substring.

Upvotes: 3

G42

Reputation: 10019

$names = @"
Set 1
Set 10
Set 11 - A Memo
Set 2
Set 20
Set 22 - A Memo With Numbers 1234
Set 3
Set 33 - A Memo
"@ -split "`n"

$names | sort @{expression={[int]($_ -replace '^\w+\s|\s.+')}}

You can pass a calculated expression to Sort-Object. Here, -replace strips everything you don't care about (with no replacement string given, the matches are simply removed), and the result is converted to int for numeric sorting. (In text sorting, 1, 10, 11, 2, 20 ... is the expected order.)

Regex breakdown

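^  - start of the string
\w - a word character (letter, digit, or underscore; matches S)
+  - the previous thing one or more times (so \w+ matches S, Se, Set, and would also match Seet or Seeeeeeeet)
\s - a whitespace character (here, the space after Set)
|  - or: match either everything before this or everything after it
\s - a whitespace character (here, the space after the number)
.  - any character
+  - one or more times, as above (so .+ matches the rest of the name)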

Note: + matches 1 or more. Use * if you need to match 0 or more.
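As a rough illustration of the breakdown above, this sketch applies the same pattern to a few of the sample names and then sorts by the result. The names $samples, $stripped and $sorted are made up for the example.

```powershell
# A few of the sample names from the question, deliberately out of order.
$samples = 'Set 10', 'Set 22 - A Memo With Numbers 1234', 'Set 2', 'Set 11 - A Memo', 'Set 1'

# -replace removes every match of either alternative:
# '^\w+\s' strips the leading 'Set ', '\s.+' strips everything from the next space on.
$stripped = $samples | ForEach-Object { $_ -replace '^\w+\s|\s.+' }
$stripped -join ','             # 10,22,2,11,1

# Sorting by the stripped text converted to [int] gives numeric order.
$sorted = $samples | Sort-Object { [int]($_ -replace '^\w+\s|\s.+') }
$sorted[0]                      # Set 1
$sorted[-1]                     # Set 22 - A Memo With Numbers 1234
```

The trailing 1234 never reaches the cast, because it is removed together with the rest of the memo text.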

Edit: As per zett42's helpful comment, you could use [int]($_ -split ' ')[1] in the Sort-Object expression. This splits your name into an array, and takes the 2nd element of that array.

Upvotes: 2
