shadow2020

Reputation: 1351

Filter list of files using an array of regular expressions (regex)

How can I change this so that I can cycle through a list of files and compare them all to an array of regular expressions?

$Regex_FileList = @"
TestFile_53-1227.txt^Home/Client/
Testfile_R-122719.txt^Home/Client/
TestingAFile1219.csv^Home/Client/ 
Test_PMT_122719.txt^Home/Client/
This_is_a_TEST_122719.txt^Home/folder/
"@


$Regex_1 = "^TestFile_R-\d{1,6}"
$Regex_2 = "This_is_a_TEST_\d{1,6}\.txt"
$Regex_3 = "^Test_NB"


$Regex_Array = @($Regex_1,$Regex_2,$Regex_3)

[array]$files = $Regex_FileList -split '\r?\n'

$files = $files | Where-Object {$_} #filter out empty array vals

$finalfiles = @()

for($i;$i -lt $files.count;$i++){

    $finalfiles = $files | Where {$_ -notmatch $Regex_Array[$i]}

}

$finalfiles

I believe my problem is this particular line: $finalfiles = $files | Where {$_ -notmatch $Regex_Array[$i]}

If I do something like $files | Where {$_ -notmatch "^This"}, the regex works as expected: it removes This_is_a_TEST_122719.txt^Home/folder/ from the results. If I change it back to $Regex_Array[$i], the $finalfiles variable ends up blank.

I also tried this instead of the for loop: $files | ForEach-Object { if($_ -notmatch $Regex_Array){$finalfiles += $_} }

Another thing I tried:

for($i;$i -lt $Regex_FileList.count;$i++){
    foreach($regex in $Regex_FileList){
        if($files[$i] -notmatch $_){
        $finalfiles += $files[$i]
        } 
    }
}

Upvotes: 0

Views: 868

Answers (2)

Lee_Dailey

Reputation: 7489

the problem seems to be the overly convoluted steps you chose to take. when i run your code on win7ps5.1, i get the following error ...

Index operation failed; the array index evaluated to null.
At line:25 char:35
+     $finalfiles = $files | Where {$_ -notmatch $Regex_Array[$i]}
+                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : InvalidOperation: (:) [], RuntimeException
+ FullyQualifiedErrorId : NullArrayIndex

i cannot understand your logic, so i gave up on that. [blush]
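
For what it's worth, the error above most likely comes from for($i;...): $i is never assigned a value, so the first pass indexes $Regex_Array with $null. The loop also counts files while indexing patterns, and every pass overwrites $finalfiles, so only the last pass survives. Below is a minimal sketch of a loop-based version, assuming the goal is to keep the entries that match none of the patterns. The $matched flag is a name introduced here for illustration only.

```powershell
# Sample data copied from the question
$files = @(
    'TestFile_53-1227.txt^Home/Client/'
    'Testfile_R-122719.txt^Home/Client/'
    'TestingAFile1219.csv^Home/Client/'
    'Test_PMT_122719.txt^Home/Client/'
    'This_is_a_TEST_122719.txt^Home/folder/'
)
$Regex_Array = @('^TestFile_R-\d{1,6}', 'This_is_a_TEST_\d{1,6}\.txt', '^Test_NB')

# Outer loop walks the files, inner loop walks the patterns.
# A file is kept only if no pattern matched it.
$finalfiles = foreach ($file in $files) {
    $matched = $false   # hypothetical helper flag, not in the original code
    foreach ($regex in $Regex_Array) {
        if ($file -match $regex) { $matched = $true; break }
    }
    if (-not $matched) { $file }
}

$finalfiles
```

Assigning the result of the outer foreach collects everything the loop emits, which avoids both the index bookkeeping and the repeated += on an array.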

here is how i would find the strings that DO NOT match your regex pattern list ...

what it does ...

  • builds a here-string, splits it into lines, and puts the results into an array
  • builds a regex pattern list
  • converts that pattern list into a regex OR structure by joining the items with | [the regex OR symbol]
  • runs -notmatch against the input list using the regex pattern
  • stores that in $NonMatches
  • displays it on the screen

the code ...

$Regex_FileList = @"
TestFile_53-1227.txt^Home/Client/
Testfile_R-122719.txt^Home/Client/
TestingAFile1219.csv^Home/Client/ 
Test_PMT_122719.txt^Home/Client/
This_is_a_TEST_122719.txt^Home/folder/
"@ -split '\r?\n'   # splits on CRLF or LF, whichever the here-string contains


$PatternList = @(
    '^TestFile_R-\d{1,6}'
    'This_is_a_TEST_\d{1,6}\.txt'
    '^Test_NB'
    )

$RegexPatternList = $PatternList -join '|'

$NonMatches = $Regex_FileList -notmatch $RegexPatternList

$NonMatches

output ...

TestFile_53-1227.txt^Home/Client/
TestingAFile1219.csv^Home/Client/ 
Test_PMT_122719.txt^Home/Client/
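
One detail worth noting about the output above: Testfile_R-122719.txt (lowercase f) is dropped by the pattern ^TestFile_R-\d{1,6} (uppercase F) because -notmatch ignores case by default. A small sketch of the difference, assuming you might want case-sensitive filtering instead. The variable names here are made up for the illustration.

```powershell
$name    = 'Testfile_R-122719.txt^Home/Client/'
$pattern = '^TestFile_R-\d{1,6}'

# -notmatch is case-insensitive, so the pattern matches and the result is False
$insensitive = $name -notmatch $pattern

# -cnotmatch is the case-sensitive variant; 'f' and 'F' differ, so the result is True
$sensitive = $name -cnotmatch $pattern

$insensitive
$sensitive
```

With -cnotmatch in place of -notmatch, that entry would stay in the list of non-matches.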

Upvotes: 0

js2010

Reputation: 27516

Another pretty simple way. All you need is the comma operator to make arrays. Select-String returns match objects rather than plain strings, and the Line property of each one holds the actual string.

$FileList = 'TestFile_53-1227.txt^Home/Client/',
  'Testfile_R-122719.txt^Home/Client/',
  'TestingAFile1219.csv^Home/Client/',
  'Test_PMT_122719.txt^Home/Client/',
  'This_is_a_TEST_122719.txt^Home/folder/'

$PatternList = '^TestFile_R-\d{1,6}',
  'This_is_a_TEST_\d{1,6}\.txt',
  '^Test_NB'

$filelist | select-string -notmatch $patternlist | foreach line

# output
TestFile_53-1227.txt^Home/Client/
TestingAFile1219.csv^Home/Client/
Test_PMT_122719.txt^Home/Client/
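
To spell out what the one-liner does, here is a rough long-form sketch on a shortened copy of the data. It assumes PowerShell 3.0 or later; $results and $lines are names introduced here for illustration.

```powershell
$FileList = 'TestFile_53-1227.txt^Home/Client/',
  'Testfile_R-122719.txt^Home/Client/',
  'This_is_a_TEST_122719.txt^Home/folder/'

$PatternList = '^TestFile_R-\d{1,6}',
  'This_is_a_TEST_\d{1,6}\.txt'

# -NotMatch emits one object per input string that matches none of the patterns
$results = @($FileList | Select-String -NotMatch $PatternList)

# Long form of "| foreach line": pull the Line property off each object
$lines = $results | ForEach-Object { $_.Line }

$lines
```

Only the first entry survives here, since the other two each match one of the patterns. Select-String is case-insensitive unless -CaseSensitive is given.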

Upvotes: 4
