JBourne

Reputation: 300

How do I find all files in a folder whose names contain words from a list?

I have a very large set of files whose names each contain a number, and a separate list of numbers. Using PowerShell (or any other Windows tool), I need to find the files whose names contain any of the numbers from that list.

I know how to find them one number at a time using

Get-ChildItem | Where-Object {$_.Name -like "*123*"}

But I don't know how to search for the whole list without chaining -or operators.

Upvotes: 2

Views: 1566

Answers (4)

HAL9256

Reputation: 13453

As js2010's helpful answer and mklement0's answer show, we can pass a string array to Get-ChildItem's -Path parameter to do the filtering. These are quick, elegant solutions and work well for a limited set of strings.

The quirk comes in with @JBourne's comment, where he mentions that he has hundreds of numbers to match. When we match hundreds of numbers against hundreds of filenames, the work for these methods grows with the product of the two counts (not exponentially, but still quickly). @Vish's very easy to understand answer demonstrates this: with, say, 100 numbers and 1,000 files, you perform 100 x 1,000 = 100,000 evaluations. I assume that the internal code for Get-ChildItem does something similar when handling a string[] array as input.

If we are interested in pure performance, arrays are a poor fit. Arrays are efficient for storing items and for accessing indexed locations, but searching them for an arbitrary value means scanning every element. What we could use instead is a slightly more complicated method built on Regex and Hashtables. Although a Hashtable is a key/value store, and in this case we don't need a "value", it is highly efficient at looking up keys, typically in O(1) time on average. With n numbers and f files, our example goes from an O(n*f) problem to an O(f) problem: we do one lookup per file, so 1 x 1,000 = 1,000 evaluations.

To start with, we need our list of keys:

$FileWithListOfNumbers = @"
123 = Matched file with 123
456 = Matched file with 456
789 = Matched file with 789
"@

$KeyHashtable = ConvertFrom-StringData $FileWithListOfNumbers

This will load our hashtable with a list of keys. Next, we iterate through our files and use Regex for matching our filenames:

Get-ChildItem | ForEach-Object {
    if($_.Name -match '\D*(\d+)\D*')
    {
        #Filename contains a number, perform a key lookup to see if it matches
        if($KeyHashtable.ContainsKey($Matches[1]))
        {
            Write-Host $_.Name
        }
    }
}

By using Regex for matching (rather than a file system provider filter) we can use a capture group to "pull" out the number. You may have to adjust the Regex to suit your specific needs and file naming convention, but this is the pattern used here:

-match '\D*(\d+)\D*'

\D*    - Match 0 or more non-digits
 (     - Start of capture group
  \d+  - Match 1 or more digits
 )     - End of capture group
\D*    - Match 0 or more non-digits

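The number we "pull" out is stored in the automatic $Matches variable, which is a hashtable: $Matches[0] holds the whole match and $Matches[1] holds the first capture group. Note that this pattern captures only the first run of digits in the name. We then perform a key lookup with that number to see if it is one of the numbers we are looking for.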
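
Since the values in the hashtable above are never read, a set type would do the same job. The following is only a sketch of that variation, assuming PowerShell 5 or later (for ::new). The function name Test-NameHasListedNumber and the sample numbers are made up for illustration. It also checks every run of digits in a name rather than only the first.

```powershell
# Hypothetical input; in practice this could be read from a file of numbers.
$numbers = '123', '456', '789'

# A HashSet stores keys only and offers O(1) average-time Contains() lookups.
$keySet = [System.Collections.Generic.HashSet[string]]::new([string[]]$numbers)

# Illustrative helper (not part of the original answer).
function Test-NameHasListedNumber {
    param(
        [string]$Name,
        [System.Collections.Generic.HashSet[string]]$Keys
    )
    # Look at every run of digits in the name, not only the first one.
    foreach ($m in [regex]::Matches($Name, '\d+')) {
        if ($Keys.Contains($m.Value)) { return $true }
    }
    return $false
}

Get-ChildItem -File |
    Where-Object { Test-NameHasListedNumber -Name $_.Name -Keys $keySet }
```

Like the hashtable version, this compares whole digit runs, so a file named a_1234.txt does not match 123, and 0123 is not treated as equal to 123. That is stricter than the -like "*123*" test in the question and may or may not be what you want.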

Upvotes: 0

js2010

Reputation: 27418

get-childitem *123*,*456*,*789*

Patterns from a file:

get-childitem -name | select-string (get-content patterns.txt)

Upvotes: 3

mklement0

Reputation: 437062

An efficient approach is to use -match, the regular-expression matching operator, with alternation (|) to search for any one of multiple patterns in a single operation:

$numbers = 42, 43, 44 # ...
Get-ChildItem | Where-Object Name -match ($numbers -join '|')
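
One thing to be aware of with the alternation pattern above: like -like "*42*", it matches substrings, so file142.txt matches 42. If whole-number matching is wanted, the pattern could be tightened with lookarounds. This is a sketch under that assumption, not a required change. The [regex]::Escape call is a precaution in case the list ever contains regex metacharacters.

```powershell
$numbers = 42, 43, 44   # example values

# Escape each entry, then join them into a single alternation.
$alternation = ($numbers | ForEach-Object { [regex]::Escape([string]$_) }) -join '|'

# (?<!\d) and (?!\d) reject a digit directly before or after the match.
$pattern = "(?<!\d)(?:$alternation)(?!\d)"

Get-ChildItem | Where-Object Name -match $pattern
```

With this pattern, file_42.txt matches while file142.txt and file420.txt do not.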

Alternatively, js2010's helpful answer shows that you can directly use Get-ChildItem's (implied) -Path parameter (whose type is [string[]], i.e., an array of paths), with an array of wildcard expressions:

$numbers = 42, 43, 44 # ...
Get-ChildItem ($numbers -replace '^|$', '*')

The above uses the -replace operator to enclose each number in *...*; that is, the above is the equivalent of:

Get-ChildItem *42*, *43*, *44*

Upvotes: 1

Vish

Reputation: 466

Try this:

$files = ( Get-ChildItem 'path' )

$numbers = 1 .. 100 # or your list contents

foreach( $n in $numbers ) {
    foreach( $f in $files.BaseName ) {
        if( $f -like "*$n*" ) {
            "Found $f"
        }
    }
}

Upvotes: 0
