DevHawk

Reputation: 107

Issues using regular expression match in the IF statement in powershell

The input file content is at the bottom of this post. The image shows the file format more clearly.

As you can see, my input file contains many lines that I don't need, so I'm trying to make PowerShell start reading only when a line matches the pattern shown below. However, the match returns False, so the script never does what I want, which is to copy everything between the matching line and the - sign that marks the end of the block.

Any idea what I'm doing wrong?

$Input_File = gc "D:\input_file.txt"
$Dest = "D:\Desktop\Final_file.txt"

# PATTERN I'M LOOKING FOR:
#  0000 00XKDPMBBRAXXX00000
#  1965 81PWSLKDTRUGXX00000

# REGEX I'VE CREATED BASED ON THE ABOVE CONTENT
$re = [regex]'(\d{4}\s\d{2}\[a-z]{12}\d{5})'

$file_line_num = 0
$msg_line_num = 0
$Dest_count     = 0

foreach ($line in $Input_File) {
  $file_line_num = $file_line_num + 1

  # Find where message starts, any other lines are ignored
  if ($line -match $re) {

     [void]$foreach.MoveNext() # skip lines not needed

     $msg_line_num = 0

     do {
        [void]$foreach.MoveNext()    
        $line = $foreach.current
        $msg_line_num = $msg_line_num + 1

        if ($msg_line_num -lt 3) {

           $header = $line.substring(7,8) + $line.substring(16, 3)
           add-content $Dest $header

        } else {
           add-content $Dest $line
        }

     } until ($line -eq "-" -or $line -eq $null) 
  }
}
Exit

----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
TEXTHERE TEXTHERE TEXTHERE
TEXTHERE
.TEXTHERE TEXTHERE TEXTHERE
TEXTHERE TEXTHERE
0000 00XKDPMBBRAXXX00000
1965 81PWSLKDTRUGXX00000
123 99
TEXTHERE
TEXTHERE//TEXTHERE
TEXTHERE
TEXTHERE
TEXTHERE
TEXTHERE//TEXTHERE
TEXTHERE//TEXTHERE
TEXTHERE TEXTHERE
TEXTHERE TEXTHERE
TEXTHERE TEXTHERE
-
=TEXTHERE TEXTHERE
=TEXTHERE TEXTHERE



NNNN++++++++++++++++++++++++++++++++++++
+                                      +
+     -- =TEXTHERE TEXTHERE            +
+      =TEXTHERE TEXTHERE              +
+                                      +
++++++++++++++++++++++++++++++++++++++++

TEXTHERE TEXTHERE TEXTHERE
TEXTHERE
.TEXTHERE TEXTHERE TEXTHERE
TEXTHERE TEXTHERE
0000 00XKDPMBBRAXXX00000
1965 81PWSLKDTRUGXX00000
123 99
TEXTHERE
TEXTHERE//TEXTHERE
TEXTHERE
TEXTHERE
TEXTHERE
TEXTHERE//TEXTHERE
TEXTHERE//TEXTHERE
TEXTHERE TEXTHERE
TEXTHERE TEXTHERE
TEXTHERE TEXTHERE
-
=TEXTHERE TEXTHERE
=TEXTHERE TEXTHERE



NNNN++++++++++++++++++++++++++++++++++++
+                                      +
+     -- =TEXTHERE TEXTHERE            +
+      =TEXTHERE TEXTHERE              +
+                                      +
++++++++++++++++++++++++++++++++++++++++

Upvotes: 0

Views: 894

Answers (2)

woxxom

Reputation: 73686

\[a-z] should be [A-Z]: the backslash is not needed, because it turns the bracket into a literal [ instead of opening a character class. Also, the [regex] class is case-sensitive, unlike the usual -match operator.
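A minimal sketch of both points, run against one of the sample lines from the question. The variable names here are illustrative and are not part of the question's script.

```powershell
# Sample line taken from the question's input file.
$line = '0000 00XKDPMBBRAXXX00000'

# Pattern as posted: '\[' matches a literal '[', so no character class is formed.
$broken = '\d{4}\s\d{2}\[a-z]{12}\d{5}'

# Corrected pattern: a real character class with upper-case letters.
$fixed = '\d{4}\s\d{2}[A-Z]{12}\d{5}'

$brokenResult = [regex]::IsMatch($line, $broken)
$fixedResult  = [regex]::IsMatch($line, $fixed)

# [regex] is case-sensitive by default, so a lower-case class fails on this line,
# while -match with a plain string pattern ignores case.
$lowerWithRegexClass = [regex]::IsMatch($line, '\d{4}\s\d{2}[a-z]{12}\d{5}')
$lowerWithMatchOp    = $line -match '\d{4}\s\d{2}[a-z]{12}\d{5}'

"broken=$brokenResult fixed=$fixedResult regexLower=$lowerWithRegexClass matchLower=$lowerWithMatchOp"
# prints: broken=False fixed=True regexLower=False matchLower=True
```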

Anyway, it's possible to shorten the code (PowerShell 3.0 and newer):

$all = ([regex]'(?s)(?<=(\d{4}\s\d{2}[a-zA-Z]{12}\d{5}\r?\n){2})(.*?)(?=\r?\n-\r?\n)').
    Matches((Get-Content source.txt -raw)).Value

Or PowerShell 2.0:

$all = ([regex]'(?s)(?<=(\d{4}\s\d{2}[a-zA-Z]{12}\d{5}\r?\n){2})(.*?)(?=\r?\n-\r?\n)').
    Matches([IO.File]::ReadAllText('r:\source.txt')) | Select -expand Value

To include the boundary lines in the copy as well, change the groups in the regex:

'(?s)(?:\d{4}\s\d{2}[a-zA-Z]{12}\d{5}\r?\n){2}.*?\r?\n-\r?\n'
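To see the difference between the two variants, here is a small sketch that runs both against an in-memory string. The sample text is a shortened, hypothetical stand-in for one message from the question's file, and $header is just a helper variable introduced for readability.

```powershell
# Hypothetical sample that mimics one message block from the input file.
$text = @(
    'TEXTHERE TEXTHERE'
    '0000 00XKDPMBBRAXXX00000'
    '1965 81PWSLKDTRUGXX00000'
    '123 99'
    'TEXTHERE//TEXTHERE'
    '-'
    '=TEXTHERE TEXTHERE'
) -join "`n"

# One header line followed by its line break.
$header = '\d{4}\s\d{2}[a-zA-Z]{12}\d{5}\r?\n'

# Body only: the lookbehind and lookahead keep the two header lines
# and the closing '-' line out of the match.
$bodyOnly = [regex]::Matches($text, "(?s)(?<=($header){2})(.*?)(?=\r?\n-\r?\n)") |
    ForEach-Object { $_.Value }

# Whole block: the header lines and the closing '-' line are part of the match.
$withBounds = [regex]::Matches($text, "(?s)(?:$header){2}.*?\r?\n-\r?\n") |
    ForEach-Object { $_.Value }
```

With this sample, $bodyOnly holds the two lines between the header and the - line, while $withBounds starts at the first header line and ends with the - line.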

Upvotes: 3

user6811411

Reputation:

> select-string .\input_file.txt -Pattern '(\d{4})\s(\d{2}[a-z]{12}\d{5})'

input_file.txt:8:0000 00XKDPMBBRAXXX00000
input_file.txt:9:1965 81PWSLKDTRUGXX00000
input_file.txt:38:0000 00XKDPMBBRAXXX00000
input_file.txt:39:1965 81PWSLKDTRUGXX00000

> select-string .\input_file.txt -Pattern '(\d{4})\s(\d{2}[a-z]{12}\d{5})'|%{$_.matches.captures.value}
0000 00XKDPMBBRAXXX00000
1965 81PWSLKDTRUGXX00000
0000 00XKDPMBBRAXXX00000
1965 81PWSLKDTRUGXX00000

> select-string .\input_file.txt -Pattern '(\d{4})\s(\d{2}[a-z]{12}\d{5})'|%{$_.matches.groups[1,2].value}
0000
00XKDPMBBRAXXX00000
1965
81PWSLKDTRUGXX00000
0000
00XKDPMBBRAXXX00000
1965
81PWSLKDTRUGXX00000

Upvotes: 0
