user3778578
user3778578

Reputation: 215

Get string between - Find all occurrences PHP

I found this function which finds data between two strings of text, html or whatever.

How can it be changed so it will find all occurrences? Every data between every occurrence of $start [some-random-data] $end. I want all the [some-random-data] of the document (It will always be different data).

function getStringBetween($string, $start, $end) {
    $string = " ".$string;
    $ini = strpos($string,$start);
    if ($ini == 0) return "";
    $ini += strlen($start);
    $len = strpos($string,$end,$ini) - $ini;
    return substr($string,$ini,$len);
}

Upvotes: 17

Views: 17331

Answers (6)

Kim Steinhaug
Kim Steinhaug

Reputation: 502

There was some great sollutions here, however not perfekt for extracting parts of code from say HTML which was my problem right now, as I need to get script blocks out of the HTML before compressing the HTML. So building on @raina77ow original sollution, expanded by @Cas Tuyn I get this one:

$test_strings = [
    '0<p>a</p>1<p>b</p>2<p>c</p>3',
    '0<p>a</p>1<p>b</p>2<p>c</p>',
    '<p>a</p>1<p>b</p>2<p>c</p>3',
    '<p>a</p>1<p>b</p>2<p>c</p>',
    '<p></p>1<p>b'
];

/**
* Seperate a block of code by sub blocks. Example, removing all <script>...<script> tags from HTML kode
* 
* @param string $str, text block
* @param string $startDelimiter, string to match for start of block to be extracted
* @param string $endDelimiter, string to match for ending the block to be extracted
* @return array [all full blocks, whats left of string]
*/
function getDelimitedStrings($str, $startDelimiter, $endDelimiter) {
    $contents = array();
    $startDelimiterLength = strlen($startDelimiter);
    $endDelimiterLength = strlen($endDelimiter);
    $startFrom = $contentStart = $contentEnd = $outStart = $outEnd = 0;
    while (false !== ($contentStart = strpos($str, $startDelimiter, $startFrom))) {
        $contentStart += $startDelimiterLength;
        $contentEnd = strpos($str, $endDelimiter, $contentStart);
        $outEnd = $contentStart - 1;
        if (false === $contentEnd) {
            break;
        }
        $contents['in'][] = substr($str, ($contentStart-$startDelimiterLength), ($contentEnd + ($startDelimiterLength*2) +1) - $contentStart);
        if( $outStart ){
            $contents['out'][] = substr($str, ($outStart+$startDelimiterLength+1), $outEnd - $outStart - ($startDelimiterLength*2));
        } else if( ($outEnd - $outStart - ($startDelimiterLength-1)) > 0 ){
            $contents['out'][] = substr($str, $outStart, $outEnd - $outStart - ($startDelimiterLength-1));
        }
        $startFrom = $contentEnd + $endDelimiterLength;
        $startFrom = $contentEnd;
        $outStart = $startFrom;
    }
    $total_length = strlen($str);
    $current_position = $outStart + $startDelimiterLength + 1;
    if( $current_position < $total_length )
        $contents['out'][] = substr($str, $current_position);

    return $contents;
}

foreach($test_strings AS $string){
    var_dump( getDelimitedStrings($string, '<p>', '</p>') );
}

This will extract all

wlements with the possible innerHTML aswell, giving this result:

array (size=2)
'in' => array (size=3)
    0 => string '<p>a</p>' (length=8)
    1 => string '<p>b</p>' (length=8)
    2 => string '<p>c</p>' (length=8)
'out' => array (size=4)
    0 => string '0' (length=1)
    1 => string '1' (length=1)
    2 => string '2' (length=1)
    3 => string '3' (length=1)

array (size=2)
'in' => array (size=3)
    0 => string '<p>a</p>' (length=8)
    1 => string '<p>b</p>' (length=8)
    2 => string '<p>c</p>' (length=8)
'out' => array (size=3)
    0 => string '0' (length=1)
    1 => string '1' (length=1)
    2 => string '2' (length=1)

array (size=2)
'in' => array (size=3)
    0 => string '<p>a</p>' (length=8)
    1 => string '<p>b</p>' (length=8)
    2 => string '<p>c</p>' (length=8)
'out' => array (size=3)
    0 => string '1' (length=1)
    1 => string '2' (length=1)
    2 => string '3' (length=1)

array (size=2)
'in' => array (size=3)
    0 => string '<p>a</p>' (length=8)
    1 => string '<p>b</p>' (length=8)
    2 => string '<p>c</p>' (length=8)
'out' => array (size=2)
    0 => string '1' (length=1)
    1 => string '2' (length=1)

array (size=2)
'in' => array (size=1)
    0 => string '<p></p>' (length=7)
'out' => array (size=1)
    0 => string '1<p>b' (length=5)

You can see a demo here: 3v4l.org/TQLmn

Upvotes: 1

Shamim
Shamim

Reputation: 41

I love to use explode to get string between two string. this function also works for multiple occurrences.

function GetIn($str,$start,$end){
    $p1 = explode($start,$str);
    for($i=1;$i<count($p1);$i++){
        $p2 = explode($end,$p1[$i]);
        $p[] = $p2[0];
    }
    return $p;
}

Upvotes: 3

Grows
Grows

Reputation: 53

I needed to find all these occurences between specific first and last tag and change them somehow and get back changed string.

So I added this small code to raina77ow approach after the function.

        $sample = '<start>One<end> aaa <start>TwoTwo<end> Three <start>Four<end> aaaaa <start>Five<end>';
        $sample_temp = getContents($sample, '<start>', '<end>');
        $i = 1;
        foreach($sample_temp as $value) {
            $value2 = $value.'-'.$i; //there you can change the variable
            $sample=str_replace('<start>'.$value.'<end>',$value2,$sample);
            $i = ++$i;
        }
        echo $sample;

Now output sample has deleted tags and all strings between them has added number like this:

One-1 aaa TwoTwo-2 Three Four-3 aaaaa Five-4

But you can do whatever else with them. Maybe could be helpful for someone.

Upvotes: 2

Cas Tuyn
Cas Tuyn

Reputation: 11

I also needed the text outside the pattern. So I changed the answer from raina77ow above a little:

function get_delimited_strings($str, $startDelimiter, $endDelimiter) {
    $contents = array();
    $startDelimiterLength = strlen($startDelimiter);
    $endDelimiterLength = strlen($endDelimiter);
    $startFrom = $contentStart = $contentEnd = $outStart = $outEnd = 0;
    while (false !== ($contentStart = strpos($str, $startDelimiter, $startFrom))) {
        $contentStart += $startDelimiterLength;
        $contentEnd = strpos($str, $endDelimiter, $contentStart);
        $outEnd = $contentStart - 1;
        if (false === $contentEnd) {
            break;
        }
        $contents['in'][] = substr($str, $contentStart, $contentEnd - $contentStart);
        $contents['out'][] = substr($str, $outStart, $outEnd - $outStart);
        $startFrom = $contentEnd + $endDelimiterLength;
        $outStart = $startFrom;
    }
    $contents['out'][] = substr($str, $outStart, $contentEnd - $outStart);
    return $contents;
}

Usage:

    $str = "Bore layer thickness [2 mm] instead of [1,25 mm] with [0,1 mm] deviation.";
    $cas = get_delimited_strings($str, "[", "]");

gives:

array(2) { 
    ["in"]=> array(3) { 
        [0]=> string(4) "2 mm" 
        [1]=> string(7) "1,25 mm" 
        [2]=> string(6) "0,1 mm" 
    } 
    ["out"]=> array(4) { 
        [0]=> string(21) "Bore layer thickness " 
        [1]=> string(12) " instead of " 
        [2]=> string(6) " with " 
        [3]=> string(10) " deviation" 
    } 
}

Upvotes: 0

ifm
ifm

Reputation: 1196

You can do this using regex:

function getStringsBetween($string, $start, $end)
{
    $pattern = sprintf(
        '/%s(.*?)%s/',
        preg_quote($start),
        preg_quote($end)
    );
    preg_match_all($pattern, $string, $matches);

    return $matches[1];
}

Upvotes: 10

raina77ow
raina77ow

Reputation: 106385

One possible approach:

function getContents($str, $startDelimiter, $endDelimiter) {
  $contents = array();
  $startDelimiterLength = strlen($startDelimiter);
  $endDelimiterLength = strlen($endDelimiter);
  $startFrom = $contentStart = $contentEnd = 0;
  while (false !== ($contentStart = strpos($str, $startDelimiter, $startFrom))) {
    $contentStart += $startDelimiterLength;
    $contentEnd = strpos($str, $endDelimiter, $contentStart);
    if (false === $contentEnd) {
      break;
    }
    $contents[] = substr($str, $contentStart, $contentEnd - $contentStart);
    $startFrom = $contentEnd + $endDelimiterLength;
  }

  return $contents;
}

Usage:

$sample = '<start>One<end>aaa<start>TwoTwo<end>Three<start>Four<end><start>Five<end>';
print_r( getContents($sample, '<start>', '<end>') );
/*
Array
(
    [0] => One
    [1] => TwoTwo
    [2] => Four
    [3] => Five
)
*/ 

Demo.

Upvotes: 53

Related Questions