Reputation: 107
This is my input file. The content is random: it can be any number, not just 9999, and any letters.
The format below will always come after a - (dash):
-
9999 99AKDSLY9ZWSRK99999
9999 99BGRPOE99FTRQ99999
Expected output:
AKDSLY9ZSRK
BGRPOE99TRQ
So I need to remove the first part of each line, always numbers:
9999 99
9999 99
Then remove the characters that are not required:
99AKDSLY9ZW → in this case the W, but it could be any letter
99BGRPOE99F → in this case the F, but it could be any letter
And finally remove the last 5 digits, always numbers:
99999
99999
What I'm trying to use is a regex (my first time using one):
$result = [regex]::Matches($InputFile, '(^\d{4}\s\d{2}[A-Z0-9]\d{5}$)') -replace '\d{4}\s\d{2}', ''
$result
It's not giving me an error message, but $result doesn't contain the characters I'm expecting to see.
I was expecting to see something in $result so that I could then start the formatting, deleting the characters I don't need.
What could be missing here, please?
Upvotes: 0
Views: 640
Reputation: 200493
Try something like this:
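# Read the whole file as one string and drop CRs so all newlines are plain LF
$str = (Get-Content ... -Raw) -replace '\r'
# Callback: strip the 7-character prefix, then reduce the last 9 characters to the 3 that are kept
$cb = {
    $args[0].Groups[1].Value -replace '(?m)^.{7}' -replace '(?m).(.{3}).{5}$', '$1'
}
# Match consecutive data lines that directly follow a hyphen and a newline
$re = [regex]'(?m)^(?<=-\n)((?:\d{4}\s\d{2}[^\n]*\d{5}(?:\n|$))+)'
$re.Replace($str, $cb)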
The regular expression $re matches multiline substrings that are preceded by a hyphen and a newline and consist of one or more lines with your digit/letter combinations. The (?<=...) is a positive lookbehind assertion that ensures you only get a match when the lines with the digit/letter combinations are preceded by a line with a hyphen (without making that line part of the actual match).
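To see what the lookbehind does, here is a minimal sketch that runs the same pattern against an inline string instead of a file. The $sample and $m variables are only for illustration and assume LF-only newlines:

```powershell
# Hypothetical sample text: a hyphen line followed by two data lines (LF-only newlines).
$sample = "-`n9999 99AKDSLY9ZWSRK99999`n9999 99BGRPOE99FTRQ99999"

$re = [regex]'(?m)^(?<=-\n)((?:\d{4}\s\d{2}[^\n]*\d{5}(?:\n|$))+)'
$m  = $re.Match($sample)

$m.Success          # True
$m.Index            # 2: the match starts after the hyphen and the newline
$m.Groups[1].Value  # both data lines, without the hyphen line
```

The match begins at index 2 because the hyphen and the newline are only checked by the lookbehind and are never consumed.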
The scriptblock $cb is an anonymous callback function that the Regex.Replace() method calls on each match. For each line in a match it removes the first 7 characters from the beginning of the line, and replaces the last 9 characters of the line with the 2nd through 4th of those characters.
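The two replacements in the callback can be tried on a single line to see each step. This is just a sketch; $line, $step1 and $step2 are illustrative names, and the input is one of the sample lines from the question:

```powershell
# One data line from the question, used as a stand-alone input.
$line = '9999 99AKDSLY9ZWSRK99999'

# Step 1: drop the first 7 characters ("9999 99").
$step1 = $line -replace '(?m)^.{7}'

# Step 2: of the last 9 characters ("WSRK99999"), keep only the 2nd through 4th ("SRK").
$step2 = $step1 -replace '(?m).(.{3}).{5}$', '$1'

$step1   # AKDSLY9ZWSRK99999
$step2   # AKDSLY9ZSRK
```

Both steps count characters by position, so this only works as long as the prefix is always 7 characters and the unwanted letter is always the 9th character from the end.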
For simplicity, the sample code removes carriage return characters (CR, \r) from the string, so that all newlines are linefeed characters (LF, \n) instead of the Windows default CR-LF.
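As a rough end-to-end check of why the CR removal matters, the following sketch uses an inline string with CR-LF newlines in place of the file content. $raw is a made-up stand-in for what Get-Content -Raw would return for a Windows-style file:

```powershell
# Hypothetical stand-in for the file content, with Windows-style CR-LF newlines.
$raw = "-`r`n9999 99AKDSLY9ZWSRK99999`r`n9999 99BGRPOE99FTRQ99999`r`n"

$cb = {
    $args[0].Groups[1].Value -replace '(?m)^.{7}' -replace '(?m).(.{3}).{5}$', '$1'
}
$re = [regex]'(?m)^(?<=-\n)((?:\d{4}\s\d{2}[^\n]*\d{5}(?:\n|$))+)'

# With the CRs still in place the lookbehind never sees "-" directly before "\n".
$re.IsMatch($raw)    # False

# Normalize to LF, then run the replacement.
$str    = $raw -replace '\r'
$result = $re.Replace($str, $cb)
$result
```

In this sketch $result holds the hyphen line followed by AKDSLY9ZSRK and BGRPOE99TRQ on separate lines. The hyphen line stays in the output because the lookbehind keeps it out of the match.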
Upvotes: 1