CaribouCode

Reputation: 14438

Regex for preg_replace HTML comments and empty tags in SVG files

When Illustrator exports SVG files, it doesn't do a very good job of optimizing them. One annoying and pointless thing it puts in near the top of the file is the following HTML comment:

<!-- Generator: Adobe Illustrator 17.1.0, SVG Export Plug-In . SVG Version: 6.00 Build 0)  -->

I also have multiple empty group tags with ids like so:

<g id="svgSteps">
</g>
<g id="svgBase">
</g>

Now I'm trying to write some PHP that uses regex and preg_replace to remove things like this. I'm completely new to regex and have already tried some solutions posted on Stack Overflow, none of which worked for me.

For the HTML comments I tried:

$fileContent = file_get_contents('my_file');
$fileContent = preg_replace('/<!--(.|\s)*?-->/','',$fileContent);
file_put_contents('my_file',$fileContent);

That didn't work. When I tried a str_replace for <!-- instead, it did work, so I know file_get_contents and file_put_contents are working (no permission issues).

What would be the correct regex for:

  1. Finding HTML comments starting with <!-- and ending with --> that have whitespace, alpha-numeric characters, periods, commas, colons and brackets inside.

  2. Finding tags starting with <g and ending with </g> that can have an id but only have either whitespace or nothing inside the tag.

Upvotes: 1

Views: 1252

Answers (3)

Alsace

Reputation: 73

This seems to work for me:

<?php
$fileContent = '<!-- Generator: Adobe Illustrator 17.1.0, SVG Export Plug-In . SVG Version: 6.00 Build 0)  --> asdlfhjlkasdjhfasdf asd <g id="kjkjkh" /> askdjghf ag <g id="eeee" > </g>ahsdjghakjhglkjdahlg';
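// Remove comments anywhere in the input (lazy match, /s lets . span newlines).
$fileContent = preg_replace('/<!--.*?-->/s', '', $fileContent);
// Remove only empty <g> tags: self-closing, or an open tag followed directly by </g>.
$fileContent = preg_replace('/<g(?:\s+id="[^"]*")?\s*(?:\/>|>\s*<\/g>)/i', ' ', $fileContent);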
echo($fileContent);
?>

Upvotes: 1

xd6_

Reputation: 483

Try:

    $fileContent = preg_replace("/((<g id=\"[^\"]*\">)|(<g>))\s*(<\/g>)/", '', $fileContent);
    $fileContent = preg_replace("/(<!--)[\s\S]*?(-->)/", '', $fileContent);

Upvotes: 3

Rafał Walczak

Reputation: 543

Try this:

$fileContent = preg_replace('#<!--.*?-->#s', '', $fileContent);
$fileContent = preg_replace('#<(\w+)(?:\s+[^>]+)?>\s*</\1>#s', '', $fileContent);

I used two separate preg_replace calls, with the comment removal first, so tags that contain only comments are also removed.
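As a minimal sketch of how the two calls might be wrapped and checked, assuming the whole SVG fits in memory: the function name cleanSvg and the sample string are hypothetical, made up for illustration. preg_replace returns null when PCRE gives up (for example on a backtrack limit), and writing that result straight back with file_put_contents would empty the file, so the sketch checks for it.

```php
<?php
// Hypothetical helper: strips comments, then elements that contain only whitespace.
function cleanSvg(string $svg): string
{
    // Comments go first, so a tag holding nothing but a comment becomes empty.
    $result = preg_replace('#<!--.*?-->#s', '', $svg);
    if ($result === null) {
        throw new RuntimeException('Comment regex failed, PCRE error code ' . preg_last_error());
    }

    // Remove elements such as <g id="svgBase"></g> whose body is empty or whitespace.
    $result = preg_replace('#<(\w+)(?:\s+[^>]+)?>\s*</\1>#s', '', $result);
    if ($result === null) {
        throw new RuntimeException('Empty-tag regex failed, PCRE error code ' . preg_last_error());
    }

    return $result;
}

// Made-up sample input resembling the Illustrator output in the question.
$sample = "<svg>\n<!-- Generator: Adobe Illustrator -->\n"
        . "<g id=\"svgSteps\">\n</g>\n"
        . "<g id=\"svgBase\"><!-- note --></g>\n"
        . "<path d=\"M0 0\"/>\n</svg>";

echo cleanSvg($sample);
```

This is a single pass, so an empty group nested inside another otherwise empty group would leave the outer one behind; running the second replacement again would catch that case.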

Upvotes: 1
