Reputation: 163
I have an HTML file and I want to collect all the class names it uses into an array using PHP. For example, this is my HTML file:
<div class="main menu">element</div>
<div class="content"></div>
I want to get an array with three elements (in this particular example): "main", "menu", "content".
In bash it is possible to use grep to accomplish this:
classes=($(grep -oP '(?<=class=").*?(?=")' "./index.html"))
How can I do the same in PHP?
I have this basic code at this moment:
//read the entire string
$str = implode("", file('./index.html'));
$fp = fopen('./index.html', 'w');
//Here I guess should be the function to get all of the strings
//now, save the file
fwrite($fp, $str, strlen($str));
Edit: How can my question be a duplicate of the one linked if I am asking how to find the strings using PHP? This is not bash, and I have already provided the grep alternative.
Upvotes: 1
Views: 208
Reputation: 12389
To get the three elements, try a regex like this with the preg_match_all function:
(?:class="|\G(?!^))\s*\K[^\s"]+
\G continues at the end of the previous match or at the start
\K resets the beginning of the reported match
See test at eval.in
if (preg_match_all('/(?:class="|\G(?!^))\s*\K[^\s"]+/', $str, $out) > 0)
    print_r($out[0]);
Array ( [0] => main [1] => menu [2] => content )
Note that regex is generally not the appropriate tool for parsing HTML. Whether it is acceptable here depends, in my opinion, on whether you are parsing your own HTML or arbitrary HTML, and on what you are trying to achieve.
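As a self-contained check of the pattern above, here is a minimal sketch that runs it against the sample markup from the question. The helper name class_names_from_html is hypothetical (it is not part of PHP), and the markup is inlined as a string so that no file is needed.

```php
<?php
// Hypothetical helper name; the pattern is the one given in this answer.
function class_names_from_html(string $html): array
{
    $pattern = '/(?:class="|\G(?!^))\s*\K[^\s"]+/';

    // preg_match_all returns the number of full matches, or false on error.
    if (preg_match_all($pattern, $html, $out) > 0) {
        return $out[0];
    }

    return [];
}

// Sample markup from the question, inlined so the sketch runs without a file.
$html = '<div class="main menu">element</div>' . "\n"
      . '<div class="content"></div>';

$classes = class_names_from_html($html);
print_r($classes);
```

To read from disk instead, file_get_contents('./index.html') could supply the string. Keep in mind that this pattern only handles double-quoted attributes, and that the literal class=" would also match inside an attribute such as data-class="...".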
Upvotes: 4
Reputation: 22959
I would use PHP's DOMDocument class like this:
$classes = array();
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTMLFile('./index.html');
$elements = $dom->getElementsByTagName('*');
foreach ($elements as $element) {
    $classes = array_merge($classes, array_filter(explode(' ', $element->getAttribute('class'))));
}
print_r($classes);
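Explanation:
Start with an empty $classes array.
Create a new DOMDocument object.
Load index.html into DOMDocument.
For each element, split its class attribute on spaces and merge the non-empty names into the $classes array.
Upvotes: 4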
Reputation: 4854
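Depending on what you're trying to do, you can either use regular expressions with PHP's preg functions (preg_match_all returns the matched substrings, while preg_grep only filters the entries of an array that match a pattern), or you could traverse the DOM using the DOMDocument class.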
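For the DOM route, a minimal sketch might look like the following. It assumes PHP's DOM extension is enabled; the helper name class_names_via_dom is hypothetical, and removing duplicates with array_unique is an extra step that the question did not ask for.

```php
<?php
// Hypothetical helper; assumes the DOM extension (DOMDocument, DOMXPath) is available.
function class_names_via_dom(string $html): array
{
    // Collect libxml parser warnings internally instead of emitting them.
    libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $classes = [];

    // Visit only the elements that actually carry a class attribute.
    foreach ($xpath->query('//*[@class]') as $element) {
        // Split on any run of whitespace and skip empty pieces.
        $names = preg_split('/\s+/', $element->getAttribute('class'), -1, PREG_SPLIT_NO_EMPTY);
        $classes = array_merge($classes, $names);
    }

    // Drop repeated names and reindex the result from zero.
    return array_values(array_unique($classes));
}

// Sample markup from the question, inlined so the sketch runs without a file.
$html = '<div class="main menu">element</div>' . "\n"
      . '<div class="content"></div>';

$classes = class_names_via_dom($html);
print_r($classes);
```

Compared with a regex, this approach does not depend on how the attribute is quoted, at the cost of parsing the whole document.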
Upvotes: 1