MrTechie

Reputation: 1847

Figuring out proper preg_match regex matching

I am fetching a page with cURL in PHP and want to extract one section of it. That section opens and closes with the HTML5 <section> tag, like this:

<section id="postingbody">
   blah blah blah content
</section>

I am not sure how to get the match working properly. I just need to fill in the matching portion here:

preg_match("/ id=\"postingbody\"\">???????<\/section>/i", $compiled_results, $matches2);

Edit

So here is an example section of the content

<section id="postingbody">
    Looking to find a side job ( working your own hours ) or career in the new media field & internet marketing? Web design, graphic design, SEO, Printing & Internet marketing company looking to hire a sales team member. We have 10+ years experience in the Web design & marketing field. Work your own hours, competitive commission rates, we can also train the right candidates for sales. Our office is located in New Jersey.<br>
</section>

So the examples here don't seem to work.

Upvotes: 0

Views: 45

Answers (2)

elixenide

Reputation: 44831

Try this:

preg_match("/(?s)<section id=\"postingbody\">((?:.)*?)<\/section>/i", $compiled_results, $matches2);

Edit: For example, the following code works as expected for me (the full match is in $matches2[0] and the section content is in $matches2[1]):

$compiled_results = '<section id="postingbody">
    Looking to find a side job ( working your own hours ) or career in the new media field & internet marketing? Web design, graphic design, SEO, Printing & Internet marketing company looking to hire a sales team member. We have 10+ years experience in the Web design & marketing field. Work your own hours, competitive commission rates, we can also train the right candidates for sales. Our office is located in New Jersey.<br>
</section>';
preg_match("/(?s)<section id=\"postingbody\">((?:.)*?)<\/section>/i", $compiled_results, $matches2);
var_dump($matches2);
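
As a minimal sketch of how the result might be consumed, the snippet below checks the return value of preg_match and reads the first capture group. The shortened sample string and the $found and $body variable names are assumptions made for illustration; the pattern is the one from this answer.

```php
<?php
// Shortened sample input (an assumption); stands in for the fetched page.
$compiled_results = "<section id=\"postingbody\">\n    Some posting text.<br>\n</section>";

// preg_match returns 1 on a match, 0 on no match, and false on error.
$found = preg_match('/(?s)<section id="postingbody">((?:.)*?)<\/section>/i', $compiled_results, $matches2);

$body = null;
if ($found === 1) {
    // $matches2[0] is the whole match; $matches2[1] is the first capture group.
    $body = trim($matches2[1]);
    echo $body, "\n";
} else {
    echo "No postingbody section found\n";
}
```

Checking the return value first avoids reading $matches2[1] when nothing matched.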

Upvotes: 2

anubhava

Reputation: 785226

Regex is not always well suited to this kind of HTML/XML parsing. It is better to use a DOM parser in PHP.
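
A minimal sketch of the DOM approach, assuming PHP's bundled DOM extension (DOMDocument and DOMXPath). The sample markup and the $text and $html variable names are illustrative assumptions, not part of the question.

```php
<?php
// Sample input (an assumption); stands in for the fetched page.
$compiled_results = '<html><body><section id="postingbody">
    Work your own hours.<br>
</section></body></html>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them: real pages
// are rarely valid markup, and libxml may complain about HTML5 tags.
libxml_use_internal_errors(true);
$doc->loadHTML($compiled_results);
libxml_clear_errors();

// Select the <section> element by its id attribute.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//section[@id="postingbody"]');

$text = null;
$html = null;
if ($nodes !== false && $nodes->length > 0) {
    $section = $nodes->item(0);
    $text = trim($section->textContent); // text only, tags dropped
    $html = $doc->saveHTML($section);    // the element serialized with its markup
    echo $text, "\n";
}
```

Here textContent gives the plain text, while saveHTML on the node keeps the inner tags such as <br>.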

However, if you really have to use a regex, this one should work with the s flag (DOTALL):

preg_match('# id="postingbody">.*?</section>#is', $compiled_results, $matches2);
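
To illustrate why the s flag matters here, the sketch below runs the same pattern with and without it on a multi-line section. The parentheses around .*? are an added capture group, and the $without_s, $with_s, $m1, and $m2 names are assumptions made for this example.

```php
<?php
// Multi-line sample modeled on the question's first example.
$compiled_results = "<section id=\"postingbody\">\n    blah blah blah content\n</section>";

// Without the s flag, "." does not match "\n", so the multi-line section is not matched.
$without_s = preg_match('# id="postingbody">(.*?)</section>#i', $compiled_results, $m1);

// With the s flag (DOTALL), "." also matches newlines.
$with_s = preg_match('# id="postingbody">(.*?)</section>#is', $compiled_results, $m2);

var_dump($without_s);    // int(0)
var_dump($with_s);       // int(1)
echo trim($m2[1]), "\n"; // blah blah blah content
```

Using # as the delimiter means the slash in </section> does not need escaping.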

Upvotes: 0
