Reputation: 2482
I have a sample text:
8 Pair 20+22AWG (7x28) Bare Copper, aDIN PVC DIN
I need to extract the specific keywords AWG and DIN, but only when they are not preceded or followed by letters.
I tried this expression:
[^a-zA-Z]+AWG|DIN
but it also extracts 20+22. How can I limit the expression to the exact keywords?
Upvotes: 0
Views: 69
Reputation: 163207
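Your pattern currently matches one or more chars other than a-zA-Z followed by AWG, OR it matches only DIN. The alternation splits the whole pattern, so the leading non-letter chars (such as 20+22) become part of the AWG match, and the DIN branch has no letter check at all.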
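To see that behaviour concretely, here is a small Python sketch (Python's re module is my assumption, the question does not name a regex flavour) that runs the original expression against the sample text:

```python
import re

# Sample text and original expression, copied from the question.
text = "8 Pair 20+22AWG (7x28) Bare Copper, aDIN PVC DIN"
original = re.compile(r"[^a-zA-Z]+AWG|DIN")

# The first branch drags the preceding non-letters into the match,
# and the second branch accepts DIN even when a letter precedes it.
matches = original.findall(text)
print(matches)  # [' 20+22AWG', 'DIN', 'DIN']
```

Note that the second element comes from aDIN, which is one of the cases the question wants to exclude.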
You could make the distinction using a capturing group, (AWG|DIN). If lookarounds are not supported, you could use the capturing group with your negated character class (without the quantifier +, as you only need to verify a single char).
(?:[^a-zA-Z]|^)(AWG|DIN)(?:[^a-zA-Z]|$)
(?:[^a-zA-Z]|^) - match any char other than a-zA-Z, or the start of the string
(AWG|DIN) - capture either AWG or DIN in group 1
(?:[^a-zA-Z]|$) - match any char other than a-zA-Z, or the end of the string
Upvotes: 1
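A minimal sketch of the capturing-group pattern in Python (again an assumed flavour, chosen only for illustration). With one capturing group, re.findall returns just the contents of group 1, so the boundary chars do not show up in the result:

```python
import re

text = "8 Pair 20+22AWG (7x28) Bare Copper, aDIN PVC DIN"
pattern = re.compile(r"(?:[^a-zA-Z]|^)(AWG|DIN)(?:[^a-zA-Z]|$)")

# findall yields group 1 only; aDIN is skipped because of the leading letter.
keywords = pattern.findall(text)
print(keywords)  # ['AWG', 'DIN']

# Caveat: the boundary char is consumed by the match, so two keywords
# separated by a single char cannot both match.
adjacent = pattern.findall("AWG DIN")
print(adjacent)  # ['AWG']
```

The caveat only matters if keywords can sit next to each other with a single separator between them; for the sample text it makes no difference.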
Reputation: 626689
You may use negative lookarounds (a lookbehind and a lookahead), and you need to fix the regex by adding a grouping construct around the values you want to extract:
(?<![a-zA-Z])(?:AWG|DIN)(?![a-zA-Z])
See the regex demo
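As a quick check outside the demo, the same pattern can be tried in Python (an assumption on my part, any flavour with lookbehind support should behave the same way here):

```python
import re

text = "8 Pair 20+22AWG (7x28) Bare Copper, aDIN PVC DIN"
pattern = re.compile(r"(?<![a-zA-Z])(?:AWG|DIN)(?![a-zA-Z])")

# No capturing group, so findall returns the whole match: the keyword itself.
keywords = pattern.findall(text)
print(keywords)  # ['AWG', 'DIN']

# Lookarounds do not consume chars, so neighbouring keywords are both found.
adjacent = pattern.findall("AWG DIN")
print(adjacent)  # ['AWG', 'DIN']
```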
Details
(?<![a-zA-Z]) - no letter allowed immediately on the left
(?:AWG|DIN) - either of the letter sequences AWG or DIN
(?![a-zA-Z]) - no letter allowed immediately on the right.
Upvotes: 1