user1307016

Reputation: 405

PHP preg-match-all regex

I would like to capture each of these in their own group with preg_match_all in PHP.

  1. The chapter, section, or page
  2. The number (plus a letter, if it has one) of the specified chapter, section, or page. An optional single space between them should be allowed for
  3. The word "and" or "or"

Keeping in mind that the number of items in the string may be dynamic, the regex should work on all the examples below:

  1. Ch1 and Sect2b
  2. Ch 4 x blahunwantedtext and Sect 5y and Sect6 z and Ch7 or Ch8

This is what I managed to come up with so far:

    <?php

    $str = 'Ch 1 a and Sect 2b and Pg3';
    preg_match_all('/([a-z]+)([\s]?[0-9]+)([\s]?[a-z]*)([\s]?and*[\s]?)/is', $str, $matches);
    print_r($matches);

    Array
    (
        [0] => Array
            (
                [0] => Ch 1 a and 
                [1] => Sect 2b and 
            )

        [1] => Array
            (
                [0] => Ch
                [1] => Sect
            )

        [2] => Array
            (
                [0] =>  1
                [1] =>  2
            )

        [3] => Array
            (
                [0] =>  a
                [1] => b
            )

        [4] => Array
            (
                [0] =>  and 
                [1] =>  and 
            )

    )

I'm unable to match the last portion of the string (Pg3) in my array.

The expected result should be:

    Array
    (
        [0] => Array
            (
                [0] => Ch 1 a and 
                [1] => Sect 2b and 
                [2] => Pg3
            )

        [1] => Array
            (
                [0] => Ch
                [1] => Sect
                [2] => Pg
            )

        [2] => Array
            (
                [0] =>  1
                [1] =>  2
                [2] =>  3
            )

        [3] => Array
            (
                [0] =>  a
                [1] => b
                [2] => 
            )

        [4] => Array
            (
                [0] =>  and 
                [1] =>  and 
                [2] =>  
            )

    )

Upvotes: 1

Views: 2742

Answers (1)

Supericy

Reputation: 5896

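This regex should work: /(ch|sect|pg)\s*(\d+)\s*([a-z]?\b)\s*(and|or)?/i. The trailing "and"/"or" group is optional, so the last item (Pg3) is matched as well, and \d+ allows numbers with more than one digit.

$str = 'Ch 1 a and Sect 2b and Pg3';
preg_match_all('/(ch|sect|pg)\s*(\d+)\s*([a-z]?\b)\s*(and|or)?/i', $str, $matches);
var_dump($matches);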


array (size=5)
  0 => 
    array (size=3)
      0 => string 'Ch 1 a and' (length=10)
      1 => string 'Sect 2b and' (length=11)
      2 => string 'Pg3' (length=3)
  1 => 
    array (size=3)
      0 => string 'Ch' (length=2)
      1 => string 'Sect' (length=4)
      2 => string 'Pg' (length=2)
  2 => 
    array (size=3)
      0 => string '1' (length=1)
      1 => string '2' (length=1)
      2 => string '3' (length=1)
  3 => 
    array (size=3)
      0 => string 'a' (length=1)
      1 => string 'b' (length=1)
      2 => string '' (length=0)
  4 => 
    array (size=3)
      0 => string 'and' (length=3)
      1 => string 'and' (length=3)
      2 => string '' (length=0)
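
If it is easier to work with one row per item rather than one array per group, something along these lines may help. It is only a sketch: the helper name parse_refs and the array keys are made up for illustration, and it assumes \d+ is wanted so that multi-digit numbers are captured. It relies on the PREG_SET_ORDER flag of preg_match_all, which groups the captures by match.

```php
<?php
// Hypothetical helper (not part of the original answer): one row per item.
function parse_refs(string $text): array
{
    // Assumption: \d+ so that multi-digit numbers are captured as well.
    $pattern = '/(ch|sect|pg)\s*(\d+)\s*([a-z]?\b)\s*(and|or)?/i';

    // PREG_SET_ORDER groups the captures per match instead of per group.
    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $rows = [];
    foreach ($matches as $m) {
        $rows[] = [
            'type'   => $m[1],
            'number' => $m[2],
            'letter' => $m[3],
            // A trailing unmatched group is left out of $m, hence the fallback.
            'joiner' => $m[4] ?? '',
        ];
    }
    return $rows;
}

// The two sample strings from the question.
$samples = [
    'Ch1 and Sect2b',
    'Ch 4 x blahunwantedtext and Sect 5y and Sect6 z and Ch7 or Ch8',
];

foreach ($samples as $sample) {
    foreach (parse_refs($sample) as $row) {
        echo implode(' | ', $row), "\n";
    }
    echo "\n";
}
```

Note that in the second sample the "and" that follows "blahunwantedtext" is not attached to the first item, because the pattern only looks for "and"/"or" directly after the number and optional letter.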

Upvotes: 1
