pake10

Reputation: 21

PHP preg_split, split by same characters

I'm trying to split a string with preg_split. Here's an example of the string: 111235622411. I want the output to look like this:

$arr[0] = "111";
$arr[1] = "2";
$arr[2] = "3";
$arr[3] = "5";
$arr[4] = "6";
$arr[5] = "22";
$arr[6] = "4";
$arr[7] = "11";

So if the same character appears several times in a row, I want those repeats in the same "chunk". I just can't come up with the regular expression I should use. I'm sorry if some of the terms are wrong; it has been some time since I last wrote PHP.

Upvotes: 2

Views: 751

Answers (3)

Academia

Reputation: 4124

A simple solution is to collect the matches with preg_match_all instead of splitting.

The regex in this case is:

(\d)\1*

What the regex means:

  • (\d): first capturing group; \d matches a single digit [0-9].
  • \1: backreference that matches the same text most recently matched by the first capturing group.
  • *: greedy quantifier that repeats the preceding \1 zero or more times, as many times as possible.

The PHP code would be:

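// Inside double quotes each backslash is doubled, so the pattern is /(\d)\1*/
$re = "/(\\d)\\1*/";
$str = "111235622411";

// $matches[0] receives every full match, i.e. each run of identical digits
preg_match_all($re, $str, $matches);
print_r($matches[0]);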

You can access the first match, "111", as $matches[0][0], the second, "2", as $matches[0][1], and so on.
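As a minimal sketch of how the result array is laid out, the snippet below wraps the call in a helper. The function name digitRuns and the 'runs'/'digits' keys are hypothetical and only for illustration; they are not part of the answer above. It shows that $matches[0] holds the full runs while $matches[1] holds only the single digit captured by the group.

```php
<?php
// Hypothetical helper (name and return shape are assumptions, not from the answer).
function digitRuns(string $str): array
{
    // Single quotes keep the backslashes literal, so no doubling is needed.
    preg_match_all('/(\d)\1*/', $str, $matches);

    return [
        'runs'   => $matches[0], // full matches: each run of identical digits
        'digits' => $matches[1], // capture group 1: the digit that starts each run
    ];
}

$result = digitRuns('111235622411');
print_r($result['runs']);   // the eight chunks asked for in the question
print_r($result['digits']); // one digit per chunk
```

Note that with \d any non-digit characters in the input are simply skipped, since they can never start a match.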


Hope it's useful!

Upvotes: 0

hek2mgl

Reputation: 158020

I would use preg_match_all():

$string = '111235622411';

preg_match_all('/(.)\1*/', $string, $matches);
var_dump($matches[0]);

\1 references the previously captured group (.), which matches any single character. This feature is called a backreference. The greedy * repeats the captured character as many times as possible, which is what the question asks for.

Output:

array(8) {
  [0]=>
  string(3) "111"
  [1]=>
  string(1) "2"
  [2]=>
  string(1) "3"
  [3]=>
  string(1) "5"
  [4]=>
  string(1) "6"
  [5]=>
  string(2) "22"
  [6]=>
  string(1) "4"
  [7]=>
  string(2) "11"
}
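To illustrate the difference between (.) here and (\d) in the other answer, here is a small sketch on a made-up input that mixes letters and digits. The input string and variable names are assumptions chosen for the example.

```php
<?php
$mixed = 'aa11b'; // hypothetical input, not from the question

// (.) groups runs of any character (except a newline, by default)
preg_match_all('/(.)\1*/', $mixed, $anyChar);

// (\d) only ever starts a run on a digit, so letters are skipped
preg_match_all('/(\d)\1*/', $mixed, $digitsOnly);

var_dump($anyChar[0]);    // "aa", "11", "b"
var_dump($digitsOnly[0]); // "11"
```

For the purely numeric string in the question both patterns give the same eight chunks.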

Upvotes: 4

Amit Joki

Reputation: 59252

You can use this regex:

(.)(?=\1)\1+|\d

And instead of splitting it, take the matches.

$matches = null;
$returnValue = preg_match_all('/(.)(?=\\1)\\1+|\\d/', '111235622411', $matches);

$matches[0] will then contain what you want. As @hek2mgl has suggested, you can also use the simpler /(\d)\1*/.
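If you specifically want preg_split, as in the question, one possible sketch is to split on the empty positions where the previous character differs from the next one. This is an alternative that is not taken from any of the answers here: the lookbehind (?<=(.)) captures the character before the position, and the negative lookahead (?!\1) requires that the next character is not the same one. PREG_SPLIT_NO_EMPTY drops the empty piece produced by the split point at the very end of the string.

```php
<?php
$str = '111235622411';

// Split at every boundary between two different characters.
// (?<=(.)) : there is a character before this position; capture it as group 1
// (?!\1)   : the next character is not that same character
$chunks = preg_split('/(?<=(.))(?!\1)/', $str, -1, PREG_SPLIT_NO_EMPTY);

// For comparison: the match-based pattern from this answer.
preg_match_all('/(.)(?=\1)\1+|\d/', $str, $matches);

print_r($chunks);
print_r($matches[0]);
```

Both calls should yield the same eight chunks for this input; the preg_match_all version is arguably easier to read.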


Upvotes: 1
