Knows Not Much

Reputation: 31546

Scala Regex Positive and Negative Look Behind at the same time

I have an input string like this

val input = """["abc:def&ghi:jkl"]"""

I want to extract abc and ghi, so I wrote this regex, which works:

val regex = """(?<=["&])(\w+)(?=[:])""".r
regex.findAllIn(input).foreach(println)

So basically I have a lookahead for : and a lookbehind for either " or &.

So far so good. But now I have an input like this:

val input = """["abc:de_&_f:xyz&ghi:jkl"]"""

With this input, the regex matches:

abc
_f
ghi

I want to change the logic of my regex to say:

Match \w+ when the lookahead finds : and the lookbehind finds " or &, but not when that & is part of _&_.

So I want to use a positive and a negative lookbehind at the same time. How do I do that?

Upvotes: 4

Views: 807

Answers (2)

anubhava

Reputation: 785641

You can add a negative lookbehind and a negative lookahead inside the lookbehind of your regex:

(?<=(?:(?<!_)&(?!_)|"))\w+(?=:)


Here we are using an alternation inside the lookbehind condition:

  • (?<!_)&(?!_): Match & only if it is neither preceded nor followed by _
  • |: OR
  • ": Match a literal "

For your case this shorter regex may also work:

(?<=["&])(?<!_&)\w+(?=:)


(?<!_&) will skip the match if \w+ is preceded by _&.
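
As a rough sketch, both patterns above can be tried against the input from the question. The object name LookbehindDemo and the val names are made up for illustration; only the regexes and the input string come from the thread. Both lists are expected to contain just abc and ghi.

```scala
object LookbehindDemo {
  // Input from the question: "_&_" sits inside a value and must not start a key.
  val input = """["abc:de_&_f:xyz&ghi:jkl"]"""

  // Alternation inside the lookbehind: '&' with no '_' on either side, or '"'.
  val alternation = """(?<=(?:(?<!_)&(?!_)|"))\w+(?=:)""".r

  // Shorter form: a second lookbehind rejects positions preceded by "_&".
  val stacked = """(?<=["&])(?<!_&)\w+(?=:)""".r

  val alternationKeys: List[String] = alternation.findAllIn(input).toList
  val stackedKeys: List[String] = stacked.findAllIn(input).toList

  def main(args: Array[String]): Unit = {
    println(alternationKeys)
    println(stackedKeys)
  }
}
```

Two lookarounds placed next to each other are both checked at the same position, which is how the shorter pattern combines a positive and a negative lookbehind.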

Upvotes: 2

The fourth bird

Reputation: 163477

You could update your pattern so that the first character matched cannot be an underscore, using a negated character class: [^\W_]\w*

As you only need the whole match, you can omit the capturing group (), and the character class [:] can be written as a plain :.

(?<=["&])[^\W_]\w*(?=:)

  • (?<=["&]) Positive lookbehind, assert what is on the left is " or &
  • [^\W_] Match a word char except _
  • \w* Match zero or more word chars
  • (?=:) Positive lookahead, assert what is on the right is :
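
A minimal Scala sketch of this pattern follows. The object name LeadingCharDemo and the second input string are hypothetical and were added only to show where the behaviour differs from a lookbehind on _&.

```scala
object LeadingCharDemo {
  // Input from the question.
  val input = """["abc:de_&_f:xyz&ghi:jkl"]"""

  // Hypothetical input: '&' is preceded by '_' but followed by a letter.
  val otherInput = """["a:b_&c:d"]"""

  // [^\W_] is a word character other than '_'; \w* then takes the rest of the key.
  val regex = """(?<=["&])[^\W_]\w*(?=:)""".r

  val keys: List[String] = regex.findAllIn(input).toList
  val otherKeys: List[String] = regex.findAllIn(otherInput).toList

  def main(args: Array[String]): Unit = {
    println(keys)      // List(abc, ghi)
    println(otherKeys) // List(a, c)
  }
}
```

This pattern only constrains the first character of the key, so it still matches c after _& in the hypothetical input. If an underscore before & should also block the match, a negative lookbehind such as (?<!_&) is the stricter choice.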


Upvotes: 2
