Anshul Dubey

Reputation: 143

PHP regex: How to extract multiple portions of a match and store them in an array?

I have a page with the following code:

<ul class="food">
<li>
<i>Bread and Butter</i>
</li>
<li>
<i>Cheese</i>
</li>
<li>
<i>Milk</i>
</li>
</ul>
<ul class="fruits">
<li>
<i>Apple</i>
</li>
<li>
<i>Mango</i>
</li>
<li>
<i>Strawberry</i>
</li>
</ul>

There are two unordered lists, and I want the contents between the italic tags collected in one array per unordered list. For example, Apple, Mango and Strawberry, the contents of the second unordered list, should be stored in one array, say array[1], and the contents of the other unordered list should be stored in array[0]. The number of items in each unordered list is variable and not known beforehand, which is another problem. The regex I tried was:

<ul class=".*">\s(?:<li>\s<i>(.*)<\/i>\s<\/li>)+<\/ul>

Apart from that regex, I tried many others throughout the day, but without success. I am new to regex and PHP and do not have much experience with either. Can someone help me with this? EDIT: I am only allowed to use regex to get the content. Using a parser is not allowed.

Upvotes: 0

Views: 100

Answers (1)

Mustofa Rizwan

Reputation: 10466

Split the full string by:

/<ul.*?>/m

Then iterate over the splits and apply the following regex to capture the italic values:

/<i>(.*?)<\/i>/m


Src:

<?php

$re = '/<ul.*?>/m';
$re1 = '/<i>(.*?)<\/i>/m';
$str = '<ul class="food">
<li>
<i>Bread and Butter</i>
</li>
<li>
<i>Cheese</i>
</li>
<li>
<i>Milk</i>
</li>
</ul>
<ul class="fruits">
<li>
<i>Apple</i>
</li>
<li>
<i>Mango</i>
</li>
<li>
<i>Strawberry</i>
</li>
</ul>';

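// Split on each opening <ul ...> tag; element 0 is whatever precedes the first list (empty here).
$list = preg_split($re, $str);
for ($i = 1; $i < count($list); $i++)
{
    // $matches[1] holds the text captured between <i> and </i> for this list.
    preg_match_all($re1, $list[$i], $matches);
    print_r($matches[1]);
}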
?>

Sample Output:

Array
(
    [0] => Bread and Butter
    [1] => Cheese
    [2] => Milk
)
Array
(
    [0] => Apple
    [1] => Mango
    [2] => Strawberry
)
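
The question asks for the results in one array indexed per list (array[0], array[1]), while the code above prints each list separately. Here is a minimal sketch of collecting them into a nested array, assuming the same split-then-match approach. The function name extractItalicsPerList and the shortened input string are illustrative only and not part of the original answer.

```php
<?php
// Hypothetical helper (an assumption for illustration): returns one array
// of <i>...</i> contents per <ul> list found in $html.
function extractItalicsPerList(string $html): array
{
    $result = [];

    // Split on every opening <ul ...> tag. Element 0 is the text before
    // the first list, so it is skipped with array_slice().
    $parts = preg_split('/<ul.*?>/', $html);

    foreach (array_slice($parts, 1) as $part) {
        // Capture the text between each <i> and </i> in this list.
        preg_match_all('/<i>(.*?)<\/i>/', $part, $matches);
        $result[] = $matches[1];
    }

    return $result;
}

// Shortened sample input (illustrative, not the full page from the question).
$str = '<ul class="food"><li><i>Bread and Butter</i></li><li><i>Cheese</i></li></ul>'
     . '<ul class="fruits"><li><i>Apple</i></li><li><i>Mango</i></li></ul>';

$result = extractItalicsPerList($str);

// $result[0] holds the first list's items, $result[1] the second's.
print_r($result);
```

Because each list is appended with $result[] = ..., the number of lists and the number of items per list do not need to be known in advance.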

Upvotes: 1
