Reputation: 97
How can I replace this: <p><span class="headline">
with this: <p class="headline"><span>
in the easiest way with PHP?
$data = file_get_contents("http://www.ihr-apotheker.de/cs1.html");
$clean1 = strstr($data, '<p>');
$str = preg_replace('#(<a.*>).*?(</a>)#', '$1$2', $clean1);
$ausgabe = strip_tags($str, '<p>');
echo $ausgabe;
Before I alter the HTML from the site, I want to move the class declaration from the <span> to the <p> tag.
Upvotes: 2
Views: 876
Reputation: 602
Have you tried using str_replace?
If the placement of the <p> and <span> tags is consistent, you can simply replace one with the other:
str_replace("part to replace", "replacement", $string);
Upvotes: 0
Reputation: 316979
Well, an answer was accepted already, but anyway, here is how to do it with the native DOM extension:
$dom = new DOMDocument;
$dom->loadHTMLFile("http://www.ihr-apotheker.de/cs1.html");
$xPath = new DOMXPath($dom);

// remove links but keep link text
foreach ($xPath->query('//a') as $link) {
    $link->parentNode->replaceChild(
        $dom->createTextNode($link->nodeValue),
        $link
    );
}

// switch classes: move "headline" from the span to its parent p
foreach ($xPath->query('//p/span[@class="headline"]') as $node) {
    $node->removeAttribute('class');
    $node->parentNode->setAttribute('class', 'headline');
}

echo $dom->saveHTML();
On a side note, HTML has elements for headings, so why not use an <h*> element instead of the semantically superfluous "headline" class?
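To try the class switch without fetching the remote page, here is a minimal sketch that runs the same XPath query against an inline string. The sample markup in $html is made up for illustration, and the libxml_use_internal_errors() call is an addition that assumes you would rather collect parser warnings than print them.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = '<div><p><span class="headline">News</span> Some text</p></div>';

// Collect parser warnings instead of printing them; real pages are rarely valid.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xPath = new DOMXPath($dom);

// Move the class from each matching span to its parent paragraph.
foreach ($xPath->query('//p/span[@class="headline"]') as $span) {
    $span->removeAttribute('class');
    $span->parentNode->setAttribute('class', 'headline');
}

// Serialize only the first <p> so the result is easy to inspect.
$result = $dom->saveHTML($dom->getElementsByTagName('p')->item(0));
echo $result, PHP_EOL;
```

Passing a node to saveHTML() limits the output to that element, which avoids the doctype and html/body wrappers that loadHTML() adds around a fragment.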
Upvotes: 1
Reputation: 12217
The reason not to parse HTML with regex is that you usually can't guarantee the format. If you already know the exact format of the string, you don't need a complete parser.
In your case, if you know that's the format, you can use str_replace:
$data = str_replace('<p><span class="headline">', '<p class="headline"><span>', $data);
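Since this approach depends on the markup matching exactly, it may help to check whether anything was actually replaced. The sketch below uses the optional fourth argument of str_replace, which receives the number of replacements; the sample string in $data is hypothetical and stands in for the fetched page.

```php
<?php
// Hypothetical sample input; the real $data would come from file_get_contents().
$data = '<p><span class="headline">News</span></p><p>Body text</p>';

// str_replace(search, replace, subject, &count):
// $count receives the number of replacements performed.
$fixed = str_replace(
    '<p><span class="headline">',
    '<p class="headline"><span>',
    $data,
    $count
);

if ($count === 0) {
    // The markup did not match the assumed format exactly (extra whitespace,
    // different quoting, additional attributes), so nothing was changed.
    echo "No match; the format assumption does not hold.", PHP_EOL;
} else {
    echo $fixed, PHP_EOL;
}
```

If the count comes back as zero on the real page, that is a sign the format is not as fixed as assumed and a DOM-based approach is the safer choice.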
Upvotes: 1
Reputation: 1754
Don't parse HTML with regex! This class should provide what you need: http://simplehtmldom.sourceforge.net/
Upvotes: 3