Benjamin W.

Reputation: 52191

Split binary number into groups of zeros and ones

I have a binary number, for example 10000111000011, and want to split it into groups of consecutive 1s and 0s, 1 0000 111 0000 11.

I thought this would be a great opportunity to use look-arounds: my regex uses a positive look-behind for a digit (which it captures for later backreferencing), then a negative look-ahead for that same digit (using a backreference), so I should get a split wherever a digit is followed by a different digit.

use strict;
use warnings;
use feature 'say';

my $bin_string = '10000111000011';
my @groups = split /(?<=(\d))(?!\g1)/, $bin_string;

say "@groups";

However, this results in

1 1 0000 0 111 1 0000 0 11 1

Somehow, the captured digit is inserted at every split. What went wrong?

Upvotes: 3

Views: 1718

Answers (3)

vks

Reputation: 67968

(?<=0)(?=1)|(?<=1)(?=0)

Simply split on this pattern. See the demo:

https://regex101.com/r/fM9lY3/3

The lookarounds find every position with a 0 behind and a 1 ahead, or a 1 behind and a 0 ahead. Because lookarounds are zero-width, the string is split in the right places without consuming any characters.
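For completeness, here is a minimal sketch of that pattern plugged into the script from the question. It assumes the input contains only 0s and 1s; the variable names are simply reused from the question.

```perl
use strict;
use warnings;
use feature 'say';

my $bin_string = '10000111000011';

# Split at every zero-width position where the digit changes:
# either 0 behind and 1 ahead, or 1 behind and 0 ahead.
my @groups = split /(?<=0)(?=1)|(?<=1)(?=0)/, $bin_string;

say "@groups";    # prints: 1 0000 111 0000 11
```

Since the pattern has no capturing group, split returns only the pieces between the split positions.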

Upvotes: 1

Wiktor Stribiżew

Reputation: 626961

Here is a small fix for your code:

my @groups = split /(?<=0(?!0)|1(?!1))/, $bin_string;

The problem is that when the pattern passed to split contains a capturing group, the captured text is also returned in the resulting list. So the solution is to get rid of the capturing group.

Since your input contains only 0 and 1, this is easy to do with an alternation and a lookahead that makes sure the next digit is different.
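To see the capturing-group behaviour of split in isolation, here is a small sketch. The comma-separated input is made up purely for illustration and is not taken from the question.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical input, chosen only to demonstrate the behaviour.
my $text = 'a,b,c';

# Capturing group in the separator: the captured commas are returned too.
my @with_capture = split /(,)/, $text;

# Non-capturing group: only the fields between the separators are returned.
my @without_capture = split /(?:,)/, $text;

say join '|', @with_capture;       # prints: a|,|b|,|c
say join '|', @without_capture;    # prints: a|b|c
```

The same thing happens in the question's pattern: the `(\d)` inside the look-behind is still a capturing group, so each captured digit ends up in the list.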


Upvotes: 3

Avinash Raj

Reputation: 174736

Just do matching instead of splitting.

(\d)\1*

Example:

use strict;
use warnings;
use feature 'say';

my $bin_string = '10000111000011';
# $1 is the whole run; $2 is its first digit, repeated via the \2 backreference.
while ($bin_string =~ m/((\d)\2*)/g) {
    say $1;
}
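If you want the groups in an array rather than printed one by one, a possible variant is to match in list context. Note that this sketch swaps in a different pattern, `0+|1+`, which has no capturing groups, so that `/g` returns the whole matches; it assumes the input consists only of 0s and 1s.

```perl
use strict;
use warnings;
use feature 'say';

my $bin_string = '10000111000011';

# In list context, a /g match without capturing groups returns
# every complete match: here, each run of 0s or run of 1s.
my @groups = $bin_string =~ /0+|1+/g;

say "@groups";    # prints: 1 0000 111 0000 11
```

With the `((\d)\2*)` pattern, a list-context match would return both captures for every run, which is why the loop above prints only `$1`.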


Upvotes: 1
