David

Reputation: 1144

Link inside text in HTML Purifier

I have a link inside text:

$va = "Some text http://www.stackoverflow.com?var=1&var2=2 more text";

When I purify it with this configuration:

$config = HTMLPurifier_Config::createDefault();
$config->set('URI.MakeAbsolute', false);
$config->set('HTML.SafeObject', true);
$config->set('Output.FlashCompat', true);
$config->set('URI.AllowedSchemes', array(
    'http'   => true,
    'https'  => true,
    'mailto' => true,
));
$def = $config->getHTMLDefinition(true);
$def->addAttribute('a', 'target', 'Enum#_blank,_self,_target,_top');
$def->addAttribute('a', 'data-width', 'Text');
$def->addAttribute('a', 'data-height', 'Text');
$def->addAttribute('a', 'id', 'Text');
$def->addAttribute('a', 'name', 'Text');
$purifier = new HTMLPurifier($config);
$va = $purifier->purify($va);

HTML Purifier replaces the & character in the link with <. How can I prevent this?

Upvotes: 0

Views: 253

Answers (2)

Edward Z. Yang

Reputation: 26762

When I run your code, I get the desired result:

<?php
ini_set('display_errors', TRUE);
error_reporting(E_ALL);

include_once 'library/HTMLPurifier.auto.php';

$raw = 'Some text http://www.stackoverflow.com?var=1&var2=2 more text';

$config = HTMLPurifier_Config::createDefault();
$config->set('URI.MakeAbsolute', false);
$config->set('HTML.SafeObject', true);
$config->set('Output.FlashCompat', true);
$config->set('URI.AllowedSchemes', array(
    'http'   => true,
    'https'  => true,
    'mailto' => true,
));
$def = $config->getHTMLDefinition(true);
$def->addAttribute('a', 'target', 'Enum#_blank,_self,_target,_top');
$def->addAttribute('a', 'data-width', 'Text');
$def->addAttribute('a', 'data-height', 'Text');
$def->addAttribute('a', 'id', 'Text');
$def->addAttribute('a', 'name', 'Text');
$purifier = new HTMLPurifier($config);

echo $purifier->purify($raw);

The output is:

Some text http://www.stackoverflow.com?var=1&amp;var2=2 more text

Notice that the ampersand has been properly escaped as &amp;, which is the correct way to write a literal & in HTML. The problem must be a bug elsewhere in your code.
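
To see why the escaped form is harmless, here is a minimal sketch using only PHP's built-in htmlspecialchars() and htmlspecialchars_decode(), not HTML Purifier itself. It assumes the input is plain text like the string in the question; the variable names $escaped and $decoded are just for illustration. Escaping turns the bare & into &amp;, and decoding (roughly what a browser does when it renders the markup) gives back the original URL.

```php
<?php
// The raw string from the question, treated as plain text.
$raw = 'Some text http://www.stackoverflow.com?var=1&var2=2 more text';

// Escape for HTML output: the bare ampersand becomes &amp;
$escaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// Decode again, approximating what a browser does when rendering.
$decoded = htmlspecialchars_decode($escaped, ENT_QUOTES);

echo $escaped, PHP_EOL; // contains "var=1&amp;var2=2"
echo $decoded, PHP_EOL; // identical to $raw
```

If the page shows something other than a plain & at this point, it may be worth checking the raw page source rather than the rendered view, and looking for any step that modifies the string after purify() returns.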

Upvotes: 2

S.Löwe

Reputation: 181

I haven't worked with this library, but it strikes me as odd that you create a definition for the link ($def) and never set it on the purifier.

Whitelisting the "<" character is not the right solution, in my view. The purifier should handle this itself if it is configured correctly.

Upvotes: 0
