Sangoku

Reputation: 1606

Conditional string replace

I have a function which is something like this:

function replaceXMLEntities($subject) {
 $subject = str_replace("&", "&amp;", $subject);
 $subject = str_replace("'", "&apos;", $subject);
 $subject = str_replace("<", "&lt;", $subject);
 $subject = str_replace(">", "&gt;", $subject);
 $subject = str_replace("\"", "&quot;", $subject);

 return $subject;
}

This function is used to convert strings into a form that is safe for XML encoding.

But I have a problem: in some cases the XML data gets encoded twice, so that

&amp;

as input becomes

&amp;amp;

Just like here, when you enter the text without code formatting :)

I need a regex which can distinguish between & and &amp;, something like:

if it is not &amp;, convert & -> &amp;; otherwise don't touch it.

Any idea how I could achieve such a regex? I could write a function, but in this case a regex is clearly the better option.

Upvotes: 1

Views: 492

Answers (4)

HamZa

Reputation: 14921

Pretty simple using preg_replace() with a regex:

$subject = preg_replace('/&(?!amp;)/', '&amp;', $subject);

&: match &
(?!amp;): Negative lookahead, check if there is no amp;

We'll still use str_replace() for the other characters, but note that it supports several inputs/replacements, so our final code will be:

function replaceXMLEntities($subject){
    $subject = preg_replace('/&(?!amp;)/', '&amp;', $subject);
    $subject = str_replace(array("'", '<', '>', '"'), array('&apos;', '&lt;', '&gt;', '&quot;'), $subject);

    return $subject;
}
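
As a quick check of the lookahead approach, here is a minimal self-contained sketch. The function name escapeXmlOnce and the sample string are made up for illustration; the pattern uses / as the delimiter and &amp; as the replacement.

```php
<?php
// Hypothetical helper name; same idea as the function above.
function escapeXmlOnce(string $subject): string
{
    // Escape only those ampersands that do not already start "&amp;".
    $subject = preg_replace('/&(?!amp;)/', '&amp;', $subject);

    // "&" is handled above, so the remaining characters can go in one call.
    return str_replace(
        array("'", '<', '>', '"'),
        array('&apos;', '&lt;', '&gt;', '&quot;'),
        $subject
    );
}

echo escapeXmlOnce('Tom & Jerry &amp; "friends"'), "\n";
// prints: Tom &amp; Jerry &amp; &quot;friends&quot;
```

Note that only &amp; is protected here: an input that already contains &lt; would still come out as &amp;lt;.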

You could also use a tricky way, first replace all &amp; with & and then replace all & with &amp;:

function replaceXMLEntities($subject){
    return str_replace(array('&amp;', '&', "'", '<', '>', '"'), array('&', '&amp;', '&apos;', '&lt;', '&gt;', '&quot;'), $subject);
}
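
The one-liner above relies on str_replace() applying the search/replace pairs in array order, each one working on the result of the previous one. A small sketch of that behavior, with a made-up function name (normalizeThenEscape) and sample input:

```php
<?php
// Hypothetical name; the body is the same call as above.
function normalizeThenEscape(string $subject): string
{
    // Pairs are applied left to right:
    // 1. "&amp;" -> "&"   (undo any existing escaping of ampersands)
    // 2. "&" -> "&amp;"   (escape every ampersand exactly once)
    // 3. the remaining special characters, which no longer interact with "&"
    return str_replace(
        array('&amp;', '&', "'", '<', '>', '"'),
        array('&', '&amp;', '&apos;', '&lt;', '&gt;', '&quot;'),
        $subject
    );
}

echo normalizeThenEscape('Tom &amp; Jerry & <b>'), "\n";
// prints: Tom &amp; Jerry &amp; &lt;b&gt;
```

Swapping the first two pairs would undo the escaping of &, so the order matters.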

Upvotes: 0

Hardcore way:

preg_replace('/&(?!#?[a-z0-9]+;)/', '&amp;', '&amp; & &lt; &gt;');

The easy and right way is to use htmlspecialchars().
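
For reference, a sketch of what the htmlspecialchars() route could look like. The flag choice (ENT_QUOTES | ENT_XML1), the UTF-8 encoding, and the wrapper name escapeXmlBuiltin are assumptions, not something the question specifies; the fourth argument, double_encode, set to false asks PHP to leave already-encoded entities alone.

```php
<?php
// Hypothetical wrapper name; flags and encoding are assumptions.
function escapeXmlBuiltin(string $subject): string
{
    // ENT_QUOTES also escapes single quotes; ENT_XML1 selects XML 1 entity names.
    // The last argument (double_encode = false) skips entities that are already encoded.
    return htmlspecialchars($subject, ENT_QUOTES | ENT_XML1, 'UTF-8', false);
}

echo escapeXmlBuiltin('a & b &amp; <c>'), "\n";
```

Unlike the plain &(?!amp;) lookahead, this should also leave entities such as &lt; untouched when they are already present in the input.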

Upvotes: 0

arychj

Reputation: 711

$subject = preg_replace('#(?!&amp;)&#', '&amp;', $subject);

Though using htmlspecialchars() might be easier...

Upvotes: 1

Mixthos

Reputation: 1087

You could achieve the same without a regular expression by replacing all occurrences of &amp; with & first, before the existing replacements run:

$subject = str_replace("&amp;", "&", $subject);

Upvotes: 1
