Exorcismus

Reputation: 2482

How to extract specific text after certain pattern

I have a sample text: 8 Pair 20+22AWG (7x28) Bare Copper, aDIN PVC DIN

I need to extract the specific keywords AWG and DIN, provided that they are not preceded or followed by letters.

I tried the expression [^a-zA-Z]+AWG|DIN, but it also extracts 20+22. How can I limit the expression to the exact keywords?

Upvotes: 0

Views: 69

Answers (2)

The fourth bird

Reputation: 163207

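You are currently matching 1+ chars other than a-zA-Z followed by AWG, OR you match only DIN. The alternation | applies to the whole pattern, so the negated character class only guards AWG, and its + quantifier pulls the preceding non-letters (such as 20+22) into the match.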
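To make that concrete, here is a small sketch that runs the original pattern against the sample text. It assumes Python's re module, which the question does not mention; the variable names are only for illustration.

```python
import re

# Sample text from the question.
text = "8 Pair 20+22AWG (7x28) Bare Copper, aDIN PVC DIN"

# The alternation splits the whole pattern into two branches:
#   branch 1: [^a-zA-Z]+AWG  (one or more non-letters, then AWG)
#   branch 2: DIN            (no boundary check at all)
original_matches = re.findall(r"[^a-zA-Z]+AWG|DIN", text)
print(original_matches)
```

The first match carries the leading " 20+22" along with AWG, and the DIN inside aDIN is matched too, because the second branch has no restriction on its neighbours.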

You could make the distinction using a capturing group (AWG|DIN)

If lookarounds are not supported, you could use the capturing group with your negated character class (without the quantifier + as you only need to verify a single char).

(?:[^a-zA-Z]|^)(AWG|DIN)(?:[^a-zA-Z]|$)
  • (?:[^a-zA-Z]|^) Match any char other than a-zA-Z or the start of the string
  • (AWG|DIN) Capture in group 1 either AWG or DIN
  • (?:[^a-zA-Z]|$) Match any char other than a-zA-Z or the end of the string
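As a rough illustration of the capturing-group approach, the sketch below uses Python's re module (an assumption, since the question names no language). The helper name extract_keywords_group is hypothetical.

```python
import re

# Boundary characters are consumed by the non-capturing groups;
# only group 1 holds the keyword itself.
GROUP_PATTERN = re.compile(r"(?:[^a-zA-Z]|^)(AWG|DIN)(?:[^a-zA-Z]|$)")


def extract_keywords_group(text: str) -> list[str]:
    """Return the keywords captured in group 1 (hypothetical helper)."""
    # With exactly one capturing group, findall returns the group 1 values.
    return GROUP_PATTERN.findall(text)


sample = "8 Pair 20+22AWG (7x28) Bare Copper, aDIN PVC DIN"
group_result = extract_keywords_group(sample)
print(group_result)
```

Note that the surrounding characters are part of the overall match, so two keywords separated by a single character (for example "AWG DIN") would yield only the first one with this pattern.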


Upvotes: 1

Wiktor Stribiżew

Reputation: 626689

You may use a negative lookbehind together with a negative lookahead, and you need to fix the regex by adding a grouping construct around the values you want to extract:

(?<![a-zA-Z])(?:AWG|DIN)(?![a-zA-Z])


Details

  • (?<![a-zA-Z]) - no letter allowed immediately on the left
  • (?:AWG|DIN) - either AWG or DIN letter sequences
  • (?![a-zA-Z]) - no letter allowed immediately on the right.
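A minimal sketch of the lookaround version, assuming Python's re module (the question does not name a regex flavor); the helper name extract_keywords_lookaround is hypothetical.

```python
import re

# Lookarounds only assert what is next to the keyword;
# they do not consume any characters.
LOOKAROUND_PATTERN = re.compile(r"(?<![a-zA-Z])(?:AWG|DIN)(?![a-zA-Z])")


def extract_keywords_lookaround(text: str) -> list[str]:
    """Return AWG/DIN occurrences that have no letter on either side."""
    # No capturing groups here, so findall returns the full matches.
    return LOOKAROUND_PATTERN.findall(text)


sample = "8 Pair 20+22AWG (7x28) Bare Copper, aDIN PVC DIN"
lookaround_result = extract_keywords_lookaround(sample)
print(lookaround_result)

# Keywords separated by a single character are both found,
# since the separator is never consumed.
adjacent_result = extract_keywords_lookaround("AWG DIN")
print(adjacent_result)
```

Because the match is exactly the keyword, there is no need to read a capture group afterwards.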

Upvotes: 1
