Reputation: 300
I have a massive list of files whose names contain a number, and a separate list of numbers. Using PowerShell (or any other Windows tool), I need to find the files whose names contain any of the numbers from that list.
I know how to search for one number at a time using:
Get-ChildItem | Where-Object {$_.Name -like "*123*"}
But I don't know how to search for the whole list without chaining -or operators.
Upvotes: 2
Views: 1566
Reputation: 13453
As js2010's helpful answer and mklement0 mention, we can pass a string array to the -Path parameter of Get-ChildItem to do our filtering. These are quick, elegant solutions that work well for small sets of strings.
The quirk comes in with @JBourne's comment, where he mentions that he has hundreds of numbers to match. When we match hundreds of numbers against hundreds of filenames, the work these methods do grows with the product of the two counts. @Vish's very easy to understand answer demonstrates this: with, say, 100 numbers and 1,000 files, you perform 100 x 1,000 = 100,000 evaluations. I assume that the internal code for Get-ChildItem does something similar when handling string[] arrays on the input.
If we are interested in pure performance, scanning an array for every file is the wrong tool. Arrays are efficient for storing items and for accessing indexed locations, but finding a value in one requires a linear scan. What we could use instead is a slightly more complicated method using Regex and Hashtables. Although Hashtables are a key/value system, and in this case we don't need a "value", they are highly efficient for looking up large numbers of keys, typically in O(1) time per lookup. With n numbers and f files, our example goes from an O(n*f) problem to an O(f) problem: we perform only 1 x 1,000 = 1,000 lookups.
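A minimal sketch of the key-lookup idea on its own, assuming PowerShell 5 or later and that the numbers are held as strings. It swaps in a HashSet as a key-only alternative to the Hashtable used in the rest of this answer; the sample values and the $lookup name are made up for illustration.

```powershell
# Hypothetical sample data: the numbers we are looking for, as strings.
$numbers = '123', '456', '789'

# A HashSet stores keys only and offers average O(1) membership tests.
$lookup = [System.Collections.Generic.HashSet[string]]::new([string[]]$numbers)

$lookup.Contains('456')   # True
$lookup.Contains('999')   # False
```

Either structure gives the same lookup behaviour; the HashSet just avoids having to invent a value for each key.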
To start with, we need our list of keys:
$FileWithListOfNumbers = @"
123 = Matched file with 123
456 = Matched file with 456
789 = Matched file with 789
"@
$KeyHashtable = ConvertFrom-StringData $FileWithListOfNumbers
This will load our hashtable with a list of keys. Next, we iterate through our files and use Regex for matching our filenames:
Get-ChildItem | ForEach-Object {
    if ($_.Name -match '\D*(\d+)\D*')
    {
        # Filename contains a number; perform a key lookup to see if it matches
        if ($KeyHashtable.ContainsKey($Matches[1]))
        {
            Write-Host $_.Name
        }
    }
}
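By using Regex for matching (rather than a file system provider to filter) we can use match groups to "pull" out the number. Note that this pattern captures only the first run of digits in the name, and the lookup is an exact match on that whole number rather than a substring match. You may have to adjust the Regex based on your specific needs and file naming convention, but the one used here breaks down as follows: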
-match '\D*(\d+)\D*'
\D* - Match 0 or more non-digits
( - Start of capture group
\d+ - Match 1 or more digits
) - End of capture group
\D* - Match 0 or more non-digits
That number we "pull" is stored in the automatic $Matches variable under index 1, i.e. $Matches[1] (index 0 holds the whole match). We then perform a key lookup with the number to see if it matches anything we are looking for.
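If a filename can contain more than one number, the -match approach above only ever looks at the first one. A possible variation, sketched here under the assumption that any run of digits in the name should be checked, uses [regex]::Matches to walk all of them. Test-NameHasKey and the sample names are hypothetical, not part of the original solution.

```powershell
# Hypothetical helper: returns $true if any run of digits in $Name is a key in $Keys.
function Test-NameHasKey {
    param(
        [string]$Name,
        [hashtable]$Keys
    )
    # [regex]::Matches returns every run of digits, not just the first one.
    foreach ($m in [regex]::Matches($Name, '\d+')) {
        if ($Keys.ContainsKey($m.Value)) { return $true }
    }
    return $false
}

# Keys are quoted so they are strings, like those from ConvertFrom-StringData.
$keys = @{ '123' = $true; '456' = $true }

Test-NameHasKey -Name 'report_2020_456.txt' -Keys $keys   # True
Test-NameHasKey -Name 'report_2020_999.txt' -Keys $keys   # False
```

It could then be used as a filter, for example: Get-ChildItem | Where-Object { Test-NameHasKey -Name $_.Name -Keys $KeyHashtable }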
Upvotes: 0
Reputation: 27418
get-childitem *123*,*456*,*789*
Patterns from a file:
get-childitem -name | select-string (get-content patterns.txt)
Upvotes: 3
Reputation: 437062
An efficient approach is to use -match, the regular-expression matching operator, with alternation (|) to search for any one of multiple patterns in a single operation:
$numbers = 42, 43, 44 # ...
Get-ChildItem | Where-Object Name -match ($numbers -join '|')
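One thing to be aware of: like the -like "*123*" filter in the question, a plain alternation matches substrings, so 42 also matches names containing 142 or 420. If only whole numbers should match, a sketch along these lines may help. The lookarounds are an addition for illustration, not part of the original answer, and the sample filenames are made up.

```powershell
$numbers = 42, 43, 44

# (?<!\d) and (?!\d) require that the matched number is not
# preceded or followed by another digit.
$pattern = '(?<!\d)(?:' + ($numbers -join '|') + ')(?!\d)'

'file_42.txt'  -match $pattern   # True
'file_142.txt' -match $pattern   # False
'file_420.txt' -match $pattern   # False
```

The same $pattern can be dropped into the pipeline above in place of ($numbers -join '|').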
Alternatively, js2010's helpful answer shows that you can directly use Get-ChildItem's (implied) -Path parameter (whose type is [string[]], i.e., an array of paths) with an array of wildcard expressions:
$numbers = 42, 43, 44 # ...
Get-ChildItem ($numbers -replace '^|$', '*')
The above uses the -replace operator to enclose each number in *...*; that is, the above is the equivalent of:
Get-ChildItem *42*, *43*, *44*
Upvotes: 1
Reputation: 466
Try this:
$files = ( Get-ChildItem 'path' )
$numbers = 1 .. 100 # or your list contents
foreach( $n in $numbers ) {
    foreach( $f in $files.BaseName ) {
        if( $f -like "*$n*" ) {
            "Found $f"
        }
    }
}
Upvotes: 0