Reputation: 21
I have an EDI file. This is the piece in question:
N1*ST*TEST
N3*ADDRESS
N4*CITY*ST*POSTAL
PER*EM*[email protected]
N1*BY*TEST
N3*ADDRESS
N4*CITY*ST*POSTAL
PER*EM*[email protected]
I am using PowerShell:
Get-ChildItem 'C:\Temp\*.edi' | Where-Object {(Select-String -InputObject $_ -Pattern 'PER\*EM\*\w+@\w+\.\w+' -List)}
I want to find the email address that appears after the N1*ST but before the N1*BY. I have an expression that matches an email address, but I am stuck on how to get only that one value. The real issue is that the email is sometimes there and sometimes not, so I really do want to ignore the second email after the N1*BY.
Thanks in advance for the help.
Upvotes: 1
Views: 47
Reputation: 627082
You can use
(?s)(?<=N1\*ST.*)PER\*EM\*\w+@\w+\.\w+(?=.*N1\*BY)
See the .NET regex demo.
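As a quick check, the pattern can be run against the sample from the question with the .NET [regex] class. This is only a minimal sketch: the two addresses are placeholders (an assumption), since the real ones are not shown in the question.

```powershell
# Sample segment from the question; both addresses are made-up placeholders.
$edi = @'
N1*ST*TEST
N3*ADDRESS
N4*CITY*ST*POSTAL
PER*EM*st@example.com
N1*BY*TEST
N3*ADDRESS
N4*CITY*ST*POSTAL
PER*EM*by@example.com
'@

$pattern = '(?s)(?<=N1\*ST.*)PER\*EM\*\w+@\w+\.\w+(?=.*N1\*BY)'

# Collect every match value; only the PER segment that has N1*ST somewhere
# before it and N1*BY somewhere after it satisfies both lookarounds.
$found = [regex]::Matches($edi, $pattern) | ForEach-Object { $_.Value }
$found
```

With this sample, the PER segment after N1*BY is skipped because no N1*BY follows it. Note that the match value still includes the PER*EM* prefix.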
Details
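(?s) - a DOTALL (RegexOptions.Singleline in .NET) inline modifier that makes . match newline chars, too
(?<=N1\*ST.*) - a positive lookbehind that matches a location preceded by N1*ST and then any zero or more chars, so N1*ST may appear anywhere earlier in the text
PER\*EM\* - a literal PER*EM* string
\w+@\w+ - 1+ word chars, @, and 1+ word chars
\. - a dot
\w+ - 1+ word chars
(?=.*N1\*BY) - a positive lookahead that matches a location followed by any zero or more chars and then the literal N1*BY string.
NOTE: You need to read in the file contents with Get-Content $filepath -Raw in order to find the proper match. Without -Raw, Get-Content emits one string per line, so a pattern that spans several lines never sees N1*ST, the PER segment and N1*BY in the same input.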
Something like
Get-ChildItem 'C:\Temp\*.edi' | % { Get-Content $_ -Raw | Select-String -Pattern '(?s)(?<=N1\*ST.*)PER\*EM\*\w+@\w+\.\w+(?=.*N1\*BY)' } | % { $_.Matches.value }
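If only the address is wanted, and files without a ship-to email should still be reported, one possible variation is sketched below. Get-ShipToEmail is a hypothetical helper name, not part of the original answer, and moving PER*EM* into the lookbehind is a swapped-in tweak that relies on .NET supporting variable-length lookbehinds.

```powershell
# Hypothetical helper: returns the bare address from the N1*ST block,
# or $null when no PER*EM segment sits between N1*ST and N1*BY.
function Get-ShipToEmail {
    param([string]$Text)

    # PER*EM* is inside the lookbehind, so it is required but not captured.
    $pattern = '(?s)(?<=N1\*ST.*PER\*EM\*)\w+@\w+\.\w+(?=.*N1\*BY)'
    $m = [regex]::Match($Text, $pattern)
    if ($m.Success) { $m.Value } else { $null }
}

# One object per file; Email stays empty when the address is missing.
Get-ChildItem 'C:\Temp\*.edi' | ForEach-Object {
    [pscustomobject]@{
        File  = $_.FullName
        Email = Get-ShipToEmail -Text (Get-Content -LiteralPath $_.FullName -Raw)
    }
}
```

Returning an object per file makes it easy to see which files lack the address, for example by piping the result to Where-Object { -not $_.Email }.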
Upvotes: 1