Reputation: 69
I have an email that I wish to parse. Its body contains lines like:
[Event Type] HireEmployee
[REQUESTOR] POLM4
[SIN] 092
[Employee Name] JOHN,SMITH
[Existing payroll record] False
[Existing PERM OA Mnemonic]
I need to be able to parse out the information after each header to store into a variable.
(\[REQUESTOR\]\t)[a-zA-Z0-9]+
will get me the line
[REQUESTOR] POLM4
but I only want it to return "POLM4"
Thanks
EDIT: I'm doing my testing on http://regexpal.com/
Upvotes: 0
Views: 2808
Reputation: 149
Store your result, [REQUESTOR] POLM4, in a variable such as var1, then replace whatever the regular expression ^[^\]]*\] matches in var1 with an empty string. This removes everything up to and including the first ], so you are left with your required string, POLM4 (trim any leading whitespace that remains).
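A minimal Python sketch of this strip-the-header approach. The variable name var1 comes from the answer; the sample value and the tab separator are assumptions based on the question's regex, and the pattern here also consumes the whitespace after the closing bracket so no separate trim is needed.

```python
import re

# Hypothetical matched line; the separator is assumed to be a tab,
# as in the question's regex.
var1 = "[REQUESTOR]\tPOLM4"

# Remove everything up to and including the first "]",
# plus any whitespace that follows it.
value = re.sub(r"^[^\]]*\]\s*", "", var1)
print(value)
```

This takes two passes (match the line, then strip the header), which is why the capture-group and lookbehind answers are usually the more direct route.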
Upvotes: 0
Reputation: 15765
Put the part you don't want in a non-capturing group.
For example, instead of your original expression, you do:
(?:\[REQUESTOR\]\t)([a-zA-Z0-9]+)
Now the [REQUESTOR] is in a non-capturing group and the rest is in the capture group.
Non-capture groups are groups you want to check, but not have saved.
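As a rough illustration in Python (the email body below is made up from the question's sample, and the tab separator is an assumption), the value is read from capture group 1 rather than from the whole match. The second pattern is a hypothetical generalization, not part of the answer, that collects every header into a dictionary.

```python
import re

# Hypothetical email body; headers and values are assumed to be tab-separated.
body = (
    "[Event Type]\tHireEmployee\n"
    "[REQUESTOR]\tPOLM4\n"
    "[SIN]\t092\n"
    "[Existing PERM OA Mnemonic]\n"
)

# The non-capturing group must match but is not saved; group(1) holds the value.
match = re.search(r"(?:\[REQUESTOR\]\t)([a-zA-Z0-9]+)", body)
requestor = match.group(1) if match else None
print(requestor)

# Same idea generalized: capture each header name and its (possibly empty) value.
fields = dict(re.findall(r"^\[([^\]]+)\][ \t]*(.*)$", body, flags=re.MULTILINE))
print(fields)
```

Note that the whole match (group 0) still contains the header text; only the numbered group excludes it. A tester that highlights the full match will therefore still show the entire line.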
Upvotes: 1
Reputation: 638
You can use a positive lookbehind. Your regex would become, for example:
(?<=\[REQUESTOR\]\t)[a-zA-Z0-9]+
It uses [REQUESTOR] to match with but does not include it in the match itself.
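A small Python sketch of the lookbehind version, assuming a tab-separated sample line. Here the whole match is already just the value, so no group index is needed.

```python
import re

# Hypothetical sample line; the separator is assumed to be a tab.
line = "[REQUESTOR]\tPOLM4"

# The lookbehind asserts that the header precedes the value
# without making it part of the match.
match = re.search(r"(?<=\[REQUESTOR\]\t)[a-zA-Z0-9]+", line)
value = match.group(0) if match else None
print(value)
```

Lookbehind support varies by engine: Python's re module accepts only fixed-width lookbehinds (this one qualifies), and some engines or older testers may not support lookbehind at all, in which case the capture-group approach is the safer choice.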
Upvotes: 0