BrainStone

Reputation: 3205

Regex has problems with positive lookbehind

I am trying to create a regex that strips unnecessary quotation marks from HTML tags. The regex will be used in PHP code.

<input type="image" src="/flags/en.png" alt="English" title="English" name="en" class="screen selected" />

should be converted to

<input type=image src="/flags/en.png" alt=English title=English name=en class="screen selected" />

I have come up with this regex and replacement:

/(?<=<(?:[^>]+?\s)?)([\w-]+=)"([\w-]+)"(?=(?:\s[^>]+)?>)/g
$1$2

The problem is that PCRE does not allow unbounded quantifiers inside a positive lookbehind (see http://regex101.com/ as a reference).

So I modified the pattern slightly, turning the lookarounds into capture groups:

/(<(?:[^>]+?\s)?)([\w-]+=)"([\w-]+)"((?:\s[^>]+)?>)/g
$1$2$3$4

Now it's valid, but it strips only one set of quotes from each tag.

How do I accomplish this?

Upvotes: 0

Views: 46

Answers (2)

Erlesand

Reputation: 1535

Probably won't save much, but here you go :)

 $string = '<input type="image" src="/flags/en.png" alt="English" title="English" name="en" class="screen selected" />'; 
 echo preg_replace('/="([a-z]+)"/i', '=$1', $string); 

Output:

<input type=image src="/flags/en.png" alt=English title=English name=en class="screen selected" />
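Two caveats with this pattern, shown as a small sketch with made-up sample strings (they are not from the question): [a-z]+ accepts letters only, so values containing digits or hyphens stay quoted, and the pattern is not tied to a tag, so matching text outside of tags is changed as well.

```php
<?php
// Same pattern as in the answer above.
$pattern = '/="([a-z]+)"/i';

// Hypothetical sample 1: "col-2" contains a hyphen and a digit, so it stays quoted.
$tag = preg_replace($pattern, '=$1', '<td data-col="col-2" align="left">');
echo $tag, PHP_EOL;   // <td data-col="col-2" align=left>

// Hypothetical sample 2: text between tags is rewritten too.
$text = preg_replace($pattern, '=$1', '<p>Set mode="fast" here</p>');
echo $text, PHP_EOL;  // <p>Set mode=fast here</p>
```

Whether either caveat matters depends on the markup being processed.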

Upvotes: 0

paolo

Reputation: 2538

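Your second pattern consumes the whole tag on every match, so a single call to preg_replace can strip only one pair of quotes per tag. Repeat the replacement until the pattern no longer matches: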

$pattern = '/(<(?:[^>]+?\s)?)([\w-]+=)"([\w-]+)"((?:\s[^>]+)?>)/';
$replacement = '$1$2$3$4';
$subject = '<input type="image" src="/flags/en.png" alt="English" title="English" name="en" class="screen selected" />';

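// Each pass removes at most one pair of quotes per tag,
// so keep replacing until a pass changes nothing.
do {
    $subject = preg_replace($pattern, $replacement, $subject, -1, $count);
} while ($count > 0);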
var_dump($subject);
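A possible alternative that avoids the loop, as a rough sketch only: match each tag once with preg_replace_callback and strip the quotes inside the callback. The function name stripSimpleQuotes is made up for this example, and the sketch assumes that attribute values never contain a literal > character.

```php
<?php
// Hypothetical helper, not part of the original answer.
function stripSimpleQuotes(string $html): string
{
    // Outer pass: hand every tag to the callback.
    // Assumption: no ">" appears inside an attribute value.
    $result = preg_replace_callback(
        '/<[^>]+>/',
        function (array $m): string {
            // Inner pass: unquote values made only of word characters and
            // hyphens. The lookahead has a fixed length, so PCRE accepts it,
            // and it leaves the separator for the next attribute to match.
            return preg_replace('/(\s[\w-]+=)"([\w-]+)"(?=[\s>])/', '$1$2', $m[0]) ?? $m[0];
        },
        $html
    );

    // Fall back to the input if PCRE reports an error.
    return $result ?? $html;
}

$subject = '<input type="image" src="/flags/en.png" alt="English" title="English" name="en" class="screen selected" />';
echo stripSimpleQuotes($subject), PHP_EOL;
// <input type=image src="/flags/en.png" alt=English title=English name=en class="screen selected" />
```

Because the inner pattern no longer has to describe the rest of the tag, every attribute is handled in one pass, and text outside of tags is left alone.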

Upvotes: 1
