DaLoco

Reputation: 201

Split string containing numbered list items before each number followed by a dot then a space

My PHP code looks like this:

$str = "1. What is love? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway 2. What is love? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway";

echo "<pre>";
print_r(preg_split('/(?=[\d]+\.)/', $str, -1, PREG_SPLIT_NO_EMPTY));
echo "</pre>";

And the output is:

Array
    (
        [0] => 1. What is love? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway 
        [1] => 2. What is love? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway
    )

I'm trying to solve a problem I discovered. It appears when $str contains a decimal number such as "1.1", for example when it begins like this: $str = "1. What is love 1.1? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway

I'm getting:

Array
(
    [0] => 1. What is love 
    [1] => 1.1? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway 
    [2] => 2. What is love? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway
)

I don't want "1.1" to start a new element at index [1]; I want it to stay inside the element at index [0].

I've tried tweaking the pattern I pass to preg_split(), but I haven't been able to get the result I want.

Can anyone advise what I should do?

Upvotes: 1

Views: 74

Answers (3)

mickmackusa

Reputation: 47764

For cleaner output and a better-performing pattern, match a space, then look ahead for an integer followed by a dot and a space. Because the space is consumed as the delimiter, no element is left with trailing whitespace.

$str = "1. What is love? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway 2. What is love 1.1? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway";

var_export(
    preg_split(
        '/ (?=\d+\. )/',
        $str,
    )
);

Output:

array (
  0 => '1. What is love? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway',
  1 => '2. What is love 1.1? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway',
)
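As a rough side-by-side check, the sketch below runs a lookahead-only pattern and the space-consuming pattern against a shortened version of the question's failing input (the shortened string is my own assumption, not the asker's exact data). It only illustrates where the delimiter space ends up.

```php
<?php
// Shortened sample based on the question, with "1.1" inside the first item.
$str = "1. What is love 1.1? a. Haddaway b. Haxxaway 2. What is love? a. Haddaway b. Haxxaway";

// Lookahead only: nothing is consumed, so the space before "2." stays on the first item.
$zeroWidth = preg_split('/(?=\d+\. )/', $str, -1, PREG_SPLIT_NO_EMPTY);

// Space consumed as the delimiter: the first item ends without trailing whitespace.
$consumed = preg_split('/ (?=\d+\. )/', $str);

var_export($zeroWidth);
echo PHP_EOL;
var_export($consumed);
```

Both calls keep "1.1?" inside the first element, because the lookahead requires a space right after the dot. The only difference is the trailing space on the first element of $zeroWidth.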

Upvotes: 0

Avinash Raj

Reputation: 174696

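Here you could use \B, which matches at any position that is not a word boundary, that is, between two word characters or between two non-word characters. In "1.1" the dot is followed by a digit (a word character), so there is a word boundary after the dot and \B fails; in "2. " the dot is followed by a space, so \B succeeds.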

$str = "1. What is love? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway 2. What is love 1.1? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway";
print_r(preg_split('~(?=\d+\.\B)~', $str, -1, PREG_SPLIT_NO_EMPTY));

Output:

Array
(
    [0] => 1. What is love? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway 
    [1] => 2. What is love 1.1? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway
)
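To see what \B accepts and rejects after the dot, here is a small sketch using preg_match() on three short strings. The strings and variable names are made up for illustration only.

```php
<?php
// "1." followed by a digit: dot (non-word) next to "1" (word) is a word boundary, so \B fails.
$decimal = preg_match('~\d+\.\B~', 'love 1.1?');

// "2." followed by a space: two non-word characters in a row, so \B succeeds.
$listItem = preg_match('~\d+\.\B~', 'away 2. What');

// "3." followed by ")": also two non-word characters, so \B succeeds here as well.
$bracket = preg_match('~\d+\.\B~', '(see item 3.)');

var_dump($decimal, $listItem, $bracket); // int(0), int(1), int(1)
```

The last case suggests that \B is more permissive than requiring a literal space after the dot, which may or may not matter for your data.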

Upvotes: 1

anubhava

Reputation: 784878

You can use a negative lookahead to make sure no digit follows the dot:

print_r(preg_split('/(?=\d+\.(?!\d))/', $str, -1, PREG_SPLIT_NO_EMPTY));

Output:

Array
(
    [0] => 1. What is love? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway
    [1] => 2. What is love 1.1? a. Haddaway b. Haxxaway c. Hassaway d. Hannaway
)
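One thing that may be worth checking with a lookahead-only pattern is how it behaves once the list reaches two-digit numbers. The sketch below uses a hypothetical input with an item "10." (not part of the question) and compares the pattern above with a variant that adds \b before \d+ and trims each element with array_map(). The variant is a suggestion of mine, not part of the original answer.

```php
<?php
// Hypothetical input: the question's "1.1" case plus a two-digit item number.
$str = "1. What is love 1.1? a. Haddaway b. Haxxaway 2. What is love? a. Haddaway 10. What is love? a. Haddaway";

// Lookahead only: it also matches in front of the "0" in "10.", so "1" is split off.
$loose = preg_split('/(?=\d+\.(?!\d))/', $str, -1, PREG_SPLIT_NO_EMPTY);

// \b anchors the match to the start of the number; trim() drops the trailing spaces.
$strict = array_map('trim', preg_split('/(?=\b\d+\.(?!\d))/', $str, -1, PREG_SPLIT_NO_EMPTY));

var_export($loose);
echo PHP_EOL;
var_export($strict);
```

With this input $loose has four elements, including a lone '1', while $strict has three, one per numbered item.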

Upvotes: 1
