Reputation: 1523
I am trying to extract all values in a text that sit between two delimiters, e.g. <p> and <\/p>.
Right now I can only extract the first one.
public function get_string_between($string, $start, $end)
{
    $string = ' ' . $string;
    $ini = strpos($string, $start);
    if ($ini == 0) return '';
    $ini += strlen($start);
    $len = strpos($string, $end, $ini) - $ini;
    return substr($string, $ini, $len);
}
$fullstring = '[{"content":{"content":"<h1>Acceptances<\/h1>","numbering":""},"children":[{"content":{"content":"<p><span>Ownership of the Products remains with the [X] and will not pass to the [Y] until one of the following events occurs:<\/span><\/p>","numbering":""},"children":[{"content":{"content":"<p><span>The [X] is paid for all of the Products and no other amounts are owed by the [Y] to the [X] in respect of other Products supplied by the [X].<\/span><\/p>","numbering":""},"children":[]},{"content":{"content":"<p><span>The [Y] sells the Products in accordance with this agreement in which case ownership of the Products will pass to the [Y] immediately before the Products are delivered to the [Y]'s customer.<\/span><\/p>","numbering":""},"children":[]}]},{"content":{"content":"<p><span>Where the Products are attached to or incorporated in other Products or are altered by the [Y], ownership of the Products shall not pass to the [Y] by virtue of the attachment, incorporation or alteration if the Products remain identifiable and, where attached to or incorporated in other Products, can be detached or removed from them.<\/span><\/p>","numbering":""}';
$paragraph_start_1 = '<p>';
$paragraph_end_2 = '<\/p>';
$paragraph = $this->get_string_between($fullstring, $paragraph_start_1, $paragraph_end_2);
//The output is just the first one and I need all.
Upvotes: 2
Views: 127
Reputation: 6646
Use regex instead:
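public function get_string_between($string, $start, $end)
{
    // Escape both markers and add the opening delimiter the pattern needs;
    // the s flag lets . span line breaks inside a paragraph.
    $re = '/' . preg_quote($start, '/') . '(.*?)' . preg_quote($end, '/') . '/s';
    preg_match_all($re, $string, $matches, PREG_SET_ORDER);
    // Each entry is [full match, captured text between the markers].
    return $matches;
}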
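If you want to test the regex (the backslash is optional here, so the pattern matches both the JSON-escaped <\/p> from the question and a plain </p>):
$re = '/<p>(.*?)<\\\\?\/p>/m';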
$str = '<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam pulvinar sollicitudin risus, et aliquam ante efficitur non. Pellentesque vel lorem euismod, efficitur turpis eu, vehicula tellus. Aliquam pretium nulla a ex sollicitudin fringilla. Praesent lacus nibh, consequat nec imperdiet nec, volutpat id lacus. Suspendisse tristique nisl sapien, imperdiet lobortis lectus vulputate dapibus. Curabitur vulputate enim felis. Curabitur vehicula risus et nisi vehicula luctus. Quisque id urna ut sem volutpat accumsan. Curabitur ut odio faucibus massa ultricies auctor. Curabitur id vulputate mi, dignissim varius turpis. In hac habitasse platea dictumst. Proin suscipit ex ut neque facilisis pellentesque. Ut et efficitur sapien.</p>
<p>Nulla facilisi. Phasellus maximus dui sed maximus sodales. Aliquam imperdiet est a elit sollicitudin, id lobortis lectus vehicula. Sed ut accumsan ligula. Maecenas id scelerisque risus, non pharetra nisi. Praesent rhoncus sem turpis, sed fermentum orci aliquet et. Sed vitae turpis id eros commodo maximus. Praesent fringilla eros nisl, ac cursus mauris iaculis vel. Donec vulputate ornare augue eget pulvinar.</p>';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
// Print the entire match result
var_dump($matches);
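With PREG_SET_ORDER every entry of $matches holds the full match at index 0 and the captured text at index 1. If only the text between the tags is needed, something like the following should flatten the result. This is a minimal sketch; the short $str is a made-up sample that uses the escaped <\/p> from the question.

```php
<?php
// Made-up sample input using the JSON-escaped closing tag from the question.
$str = '<p>first<\/p> some text <p>second<\/p>';

// Optional backslash before the slash, so both </p> and <\/p> match.
$re = '/<p>(.*?)<\\\\?\/p>/s';

preg_match_all($re, $str, $matches, PREG_SET_ORDER);

// Each entry is [full match, group 1]; keep only the captured text.
$paragraphs = array_column($matches, 1);

print_r($paragraphs);
```

$paragraphs then contains 'first' and 'second'.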
Upvotes: 1
Reputation: 896
Only use a regex for this kind of problem if you are absolutely sure the input string always follows the same format, for example always exactly one <p> whose position is unknown.
Otherwise, extract the text with a native DOM or XML parser. See this extensive answer: How do you parse and process HTML/XML in PHP?
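For the data in the question, a parser-based version could look roughly like the sketch below. It assumes the input is valid JSON shaped like the question's (nested content/children entries); the string as pasted in the question appears to be cut off, so json_decode() may return null for it. The $json sample and the helper name collect_paragraphs are made up for illustration.

```php
<?php
// Hypothetical, shortened sample in the same shape as the question's data.
$json = '[{"content":{"content":"<h1>Acceptances<\/h1>","numbering":""},"children":[{"content":{"content":"<p><span>First clause.<\/span><\/p>","numbering":""},"children":[]},{"content":{"content":"<p><span>Second clause.<\/span><\/p>","numbering":""},"children":[]}]}]';

// Hypothetical helper: walks the tree and returns the text of every <p>.
function collect_paragraphs(array $nodes): array
{
    $out = [];
    foreach ($nodes as $node) {
        $html = $node['content']['content'] ?? '';
        if ($html !== '') {
            $dom = new DOMDocument();
            // Silence libxml notices about the fragment markup.
            $dom->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);
            foreach ($dom->getElementsByTagName('p') as $p) {
                $out[] = trim($p->textContent);
            }
        }
        // Recurse into nested children, if any.
        $out = array_merge($out, collect_paragraphs($node['children'] ?? []));
    }
    return $out;
}

// json_decode() turns the escaped <\/p> back into </p>.
$tree = json_decode($json, true);
$paragraphs = is_array($tree) ? collect_paragraphs($tree) : [];

print_r($paragraphs);
```

Decoding the JSON first also removes the need to match the escaped <\/p> at all, since the parser only ever sees regular HTML.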
Upvotes: 1