Gary Hillerson

Reputation: 131

Replacing a link with plain text using PHP Simple HTML DOM

I have a program that removes certain pages from a website; I then want to traverse the remaining pages and "unlink" any links to the removed pages. I'm using simplehtmldom. My function takes a source page ($source) and an array of pages ($skipList). It finds the links, and I'd like to manipulate the DOM to replace each matching element with its $link->innertext, but I don't know how. Any help?

function RemoveSpecificLinks($source, $skipList) {
    // $source is the html source file;
    // $skipList is an array of link destinations (hrefs) that we want unlinked
    $docHtml = file_get_contents($source);
    $htmlObj = str_get_html($docHtml);
    $links = $htmlObj->find('a');
    if (isset($links)) {
        foreach ($links as $link) {
            if (in_array($link->href, $skipList)) {
                $link->href = ''; // Should convert to simple text element
            }
        }
    }
    $docHtml = $htmlObj->save();
    $htmlObj->clear();
    unset($htmlObj);
    return($docHtml);
}

Upvotes: 0

Views: 2042

Answers (1)

George

Reputation: 628

I have never used simplehtmldom, but this is what I think should solve your problem:

function RemoveSpecificLinks($source, $skipList) {
    // $source is the HTML source file;
    // $skipList is an array of link destinations (hrefs) that we want unlinked
    $docHtml = file_get_contents($source);
    $htmlObj = str_get_html($docHtml);
    $links = $htmlObj->find('a');
    if (isset($links)) {
        foreach ($links as $link) {
            if (in_array($link->href, $skipList)) {
                // Replace the whole <a> element with its text only
                // (any nested tags are stripped). This should work:
                $link->outertext = $link->plaintext;

                // If it does not, or if you want to keep markup nested
                // inside the link, try:
                // $link->outertext = $link->innertext;
            }
        }
    }
    $docHtml = $htmlObj->save();
    $htmlObj->clear();
    unset($htmlObj);
    return $docHtml;
}

Please let me know whether this worked and, if so, which method did.

Update: Maybe you would prefer this:

$link->outertext = $link->href;

This way the link's URL is displayed as text, but it is not clickable.
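If simplehtmldom is not a hard requirement, the same unlinking can probably be done with PHP's built-in DOM extension instead. The sketch below is an alternative technique, not the simplehtmldom approach above: the function name unlinkWithDomDocument and the sample markup are made up for illustration, and it takes the HTML as a string rather than a file path. Note that loadHTML() may need an explicit charset hint for non-ASCII pages, and saveHTML() returns a full document (with a doctype), so treat this as a starting point only.

```php
<?php
// Hypothetical helper: uses PHP's built-in DOMDocument, not simplehtmldom.
function unlinkWithDomDocument(string $html, array $skipList): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them,
    // since real-world HTML is rarely perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName() returns a live list; copy it to an array
    // so replacing nodes inside the loop does not skip any links.
    $links = iterator_to_array($doc->getElementsByTagName('a'));

    foreach ($links as $link) {
        if (in_array($link->getAttribute('href'), $skipList, true)) {
            // textContent is the link's text with nested tags stripped,
            // comparable to simplehtmldom's plaintext.
            $text = $doc->createTextNode($link->textContent);
            $link->parentNode->replaceChild($text, $link);
        }
    }

    return $doc->saveHTML();
}

// Sample input (made up for this example).
$html = '<html><body><p>See <a href="old.html">the old page</a> '
      . 'and <a href="new.html">the new page</a>.</p></body></html>';

echo unlinkWithDomDocument($html, ['old.html']);
```

Replacing the node with a text node drops the anchor element entirely, so nothing is left behind for the browser to render as a link; links whose href is not in the skip list are left untouched.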

Upvotes: 1
