Reputation: 6295
I'm monitoring incoming e-mail subjects, and each subject may contain a particularly formatted code inside it which I used to reference something else with down the line.
These codes can be anywhere within the string, and sometimes not at all - and so the problem I'm having is my lack of RegEx skills (which I assume is the best option for this solution?).
An example of a subject would be:
"Please refer to reference MZ5051CLA"
or
"Attention for Mr Danshi, RE. 11123MTX"
The codes I'm looking to extract in these scenarios are "MZ5051CLA" and "11123MTX".
The format of MZ5051CLA will be:
- Always starts with "MZ"
- Follows by a number
- Always ends with "CLA"
Is there a simple way to evaluate the subject as a whole and extract any words that match the codes only?
I've looked at various solutions to my problem here on SO, but they're either overly complicated or I can't quite relate.
Edit:
As ShashishChandra pointed out, the idea is to monitor multiple mailboxes, each with their own code formats. So my idea was to implement a regex setting for each mailbox.
Perhaps this was important to mention initially, since a solution to catch all formats in one regex won't work. Apologies for that.
Upvotes: 1
Views: 96
Reputation: 499
So, in that case if you don't mind false positives, then use: /^(?=.*[0-9])(?=.*[A-Z])([A-Z0-9]+)$/
. This will work well in general.
Upvotes: 0
Reputation: 41838
Both Codes in One Pattern
It seems that the codes must include at least one uppercase letter and at least one digit. For that kind of pattern, a password-validation technique is commonly used, and I would suggest:
\b(?=[A-Z0-9]*[A-Z])[A-Z0-9]*[0-9][A-Z0-9]*
In the demo, see how only the correct groups are matched. Of course false positives are possible.
Reference
Upvotes: 1
Reputation: 174706
The below regex would match only the first string MZ5051CLA
\bMZ\d+CLA\b
But this would match the both strings MZ5051CLA
and 11123MTX
,
\b[A-Z0-9]+$
All alphanumeric characters present at the last of a line are matched.
This would get you the Alphanumeric string which starts with MZ
and ends with CLA
or starts with a number and ends with mtx
(?:\b[A-Z0-9]+$|\b\d+MTX\b)
Upvotes: 1