Rob

Reputation: 361

How to match overlapping Perl regex?

I am trying to get a regex to match an overlapping section when using /g. I believe I need to use lookbehind, but I'm having trouble understanding the documentation and getting it to match.

For example, the test case:

use feature ':5.18';
use warnings;
use diagnostics;
use utf8;

my $test = '1 1 1';
$test =~ s/(?=[0-9]+) ([0-9]+)/$1$2/g;
say $test;      # still get '1 1 1'

How do I get rid of the spaces? The output should be '111'.

Upvotes: 1

Views: 188

Answers (2)

ikegami

Reputation: 386706

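The problem arises when both adjacent digits are part of the match. With /g, each search resumes where the previous match ended, so a digit consumed as the trailing digit of one match cannot also be the leading digit of the next.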

$test =~ s/([0-9])\s+([0-9])/$1$2/g;   # XXX Bad
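To see the failure concretely, here is a minimal sketch that runs the bad substitution on the test string from the question. The variable name $bad is only illustrative.

```perl
use strict;
use warnings;

# Test string taken from the question.
my $test = '1 1 1';

# Copy $test, then apply the substitution to the copy.
# The middle digit is consumed as $2 of the first match, so the
# next search starts after it and it cannot act as $1 again.
(my $bad = $test) =~ s/([0-9])\s+([0-9])/$1$2/g;

print "$bad\n";   # prints "11 1"
```

Only the first space is removed, because the second space no longer has an unconsumed digit in front of it.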

Solutions:

$test =~ s/(?<=[0-9])\s+(?=[0-9])//g;

or the more efficient

$test =~ s/[0-9]\K\s+(?=[0-9])//g;   # 5.10+

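In the former, the adjacent digits are never part of the match, so nothing is consumed except the whitespace.

In the latter, only the preceding digit is part of the match, and \K excludes it from the text that gets replaced. The following digit is still only looked at, so it remains available to start the next match.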
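A small sketch that applies both solutions to the question's test string, assuming Perl 5.10 or later for \K. The variable names $lookaround and $keep are made up for the comparison.

```perl
use strict;
use warnings;

my $test = '1 1 1';   # test string from the question

# Lookbehind + lookahead: only the whitespace is matched.
(my $lookaround = $test) =~ s/(?<=[0-9])\s+(?=[0-9])//g;

# \K (Perl 5.10+): the digit is matched but kept out of the replaced text.
(my $keep = $test) =~ s/[0-9]\K\s+(?=[0-9])//g;

print "$lookaround\n";   # prints "111"
print "$keep\n";         # prints "111"
```

Both forms give the same result here; they differ only in how the preceding digit is handled.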

Upvotes: 3

anubhava

Reputation: 786349

To remove the spaces between digits, you can use zero-width lookaround assertions:

$test =~ s/(?<=[0-9])\s+(?=[0-9])//g;

Breakdown:

  • (?<=[0-9]): Assert that there is a digit at the previous position
  • \s+: Match one or more whitespace characters
  • (?=[0-9]): Assert that there is a digit at the next position
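The same breakdown can be written into the pattern itself with the /x modifier, which lets the regex carry whitespace and comments. This is just one way to lay it out, using the test string from the question:

```perl
use strict;
use warnings;

my $test = '1 1 1';   # test string from the question

$test =~ s/
    (?<=[0-9])   # a digit must precede (not consumed)
    \s+          # one or more whitespace characters (consumed and removed)
    (?=[0-9])    # a digit must follow (not consumed)
//gx;

print "$test\n";   # prints "111"
```

Because neither digit is consumed, each digit can serve as the "following" digit for one match and the "preceding" digit for the next.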

Upvotes: 5
