Irene T.

Reputation: 1393

Extract inner text from a div by id using an html string

I have an HTML string that contains only the following divs:

<div id="title">My Title</div>
<div id="image">http://www.mpahmplakdjfe.co.uk/images/01.jpg</div>
<div id="fullcontent">In this div there are some html elements more</div>

I need to extract the inner text of each div ("My Title", etc.).

How can I do this with preg_match?

I tried the following (Simple HTML DOM) without luck:

$html = new simple_html_dom();
$html->load_file($myhtml);
$ret = $html->find('div[id=title]')->innertext; // (or outertext)
echo $ret;

Thanks !!!!

Upvotes: 3

Views: 4418

Answers (3)

Mojtaba Rezaeian

Reputation: 8736

I had the same question and found a solution using a regex. Here is the pattern for your case:

\<div.*?\>(.*?)<\/div>
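
As a rough sketch of how this pattern might be applied, the snippet below runs it through preg_match_all() against the markup from the question. The variable names and the /s modifier are my own additions, not part of the original answer, and the pattern assumes the divs are not nested.

```php
<?php
// Sample input copied from the question.
$html = '<div id="title">My Title</div>
<div id="image">http://www.mpahmplakdjfe.co.uk/images/01.jpg</div>
<div id="fullcontent">In this div there are some html elements more</div>';

// The /s modifier (an assumption, not in the answer) lets "." match
// newlines too, in case a div's text spans several lines.
preg_match_all('/<div.*?>(.*?)<\/div>/s', $html, $matches);

// $matches[1] holds the captured inner text of each div, in document order.
foreach ($matches[1] as $text) {
    echo $text, PHP_EOL;
}
```

Because the capture group is lazy, each match stops at the first closing </div>, so nested divs would be cut short.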

Upvotes: 0

1111161171159459134

Reputation: 1215

preg_match('|<[^>]*title[^>]*>(.*?)<|', $html, $m);

will give you "My Title".

preg_match('|<[^>]*image[^>]*>(.*?)<|', $html, $m);

will give you "http://www.mpahmplakdjfe.co.uk/images/01.jpg".

preg_match('|<[^>]*fullcontent[^>]*>(.*?)<|', $html, $m);

will give you "some text here".

You can do it this way:

$html = '<div id="title">My Title</div>
<div id="image">http://www.mpahmplakdjfe.co.uk/images/01.jpg</div>
<div id="fullcontent">some text here</div>';

$m = array();
preg_match('|<[^>]*title[^>]*>(.*?)<|', $html, $m);
// inner text is in $m[1]
echo $m[1]; // == 'My Title'


If you want to get all inner text from the string, use preg_match_all() instead of preg_match():

// say you have this string
$html = '<div id="fullcontent"><div>hi</div><div>hello</div></div>';

$m = array();
// [^<]+ (one or more) skips the empty gaps between adjacent tags such as "><"
preg_match_all('|>(?<innerText>[^<]+)<|', $html, $m);
echo count($m['innerText']); // 2 (number of matches)
echo $m['innerText'][0];     // == 'hi'
echo $m['innerText'][1];     // == 'hello'

phpfiddle - http://x.co/6lbC6


If you want the inner text of <div>s only, modify the preg_match_all() call above like this:

preg_match_all('|<div[^>]*>(?<innerText>[^<]+)<|', $html, $m);
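
To illustrate the difference, here is a small sketch that runs the <div>-only pattern on a string mixing divs with other tags. The sample markup is made up for the example and does not come from the question.

```php
<?php
// Hypothetical input: two divs surrounded by other elements.
$html = '<p>skip me</p><div>hi</div><span>nor me</span><div class="x">hello</div>';

$m = array();
// Only text that directly follows an opening <div ...> tag is captured.
preg_match_all('|<div[^>]*>(?<innerText>[^<]+)<|', $html, $m);

print_r($m['innerText']); // the <p> and <span> text is not included
```

Here $m['innerText'] should contain only 'hi' and 'hello', whereas a pattern without the <div prefix would also pick up the text inside <p> and <span>.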

Upvotes: 1

Martyn Shutt

Reputation: 1706

    $subject = "<div id=\"image\">http://www.mpahmplakdjfe.co.uk/images/01.jpg</div>";

    preg_match("/<div id=\".*\">(.*)<\/div>/", $subject, $matches);

    print_r($matches[1]);

To understand the regex used in more detail:

https://regex101.com/r/tN6mD8/1

Regular expressions can look a little confusing in PHP because double quotes inside a double-quoted string have to be escaped. I always write mine in a separate editor first.

Edit: to get a specific tag:

    $subject = '<div id="image">http://www.mpahmplakdjfe.co.uk/images/01.jpg</div>';
    $title = '"image"';

    preg_match("/<div id=".$title.">(.*)<\/div>/", $subject, $matches);
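
One possible way to generalise this is to wrap the lookup in a small function. The sketch below is only an illustration: divTextById() is a hypothetical helper name, and it assumes the id is written with double quotes and that the div contains no nested divs.

```php
<?php
// Hypothetical helper: returns the inner text of the first <div> whose id
// attribute equals $id, or null when no such div is found.
function divTextById(string $html, string $id): ?string
{
    // preg_quote() escapes regex metacharacters in the id, and the "/"
    // delimiter; the lazy (.*?) stops at the first closing </div>.
    $pattern = '/<div\b[^>]*\bid="' . preg_quote($id, '/') . '"[^>]*>(.*?)<\/div>/s';

    return preg_match($pattern, $html, $m) === 1 ? $m[1] : null;
}

$subject = '<div id="title">My Title</div>
<div id="image">http://www.mpahmplakdjfe.co.uk/images/01.jpg</div>';

echo divTextById($subject, 'image'), PHP_EOL;
var_dump(divTextById($subject, 'missing'));
```

Using a lazy group rather than the greedy (.*) keeps the match from running on to a later </div> when the string holds several divs.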

Upvotes: 0
