Reputation: 15506
Purely for the purpose of obfuscation, the first three preg_replace lines seem to strip unnecessary line breaks from the script quite nicely.
Can anyone tell me what lines 1-4 (the four preg_replace calls) actually do? The only thing I know from trial and error is that if I comment out the fourth line the site works, and if I leave it in place the site breaks.
<?php
header("Content-type: text/javascript; charset=UTF-8");
ob_start("compress");
function compress($buffer)
{
    # remove extra or unnecessary new lines from javascript
    $buffer = preg_replace('/([;])\s+/', '$1', $buffer);
    $buffer = preg_replace('/([}])\s+(else)/', '$1else', $buffer);
    $buffer = preg_replace('/([}])\s+(var)/', '$1;var', $buffer);
    $buffer = preg_replace('/([{};])\s+(\$)/', '$1\$', $buffer);
    return $buffer;
}
Is there a better way to remove one or more line breaks from JavaScript?
Upvotes: 1
Views: 1518
Reputation: 76646
Let's try and dissect each one of the regular expressions.
First regex
$buffer = preg_replace('/([;])\s+/', '$1', $buffer);
Explanation
( # beginning of the first capturing group
[;] # match the literal character ';'
) # ending of the first capturing group
\s+ # one or more whitespace characters (including newlines)
The above regular expression removes any whitespace that occurs immediately after a semicolon. ([;]) is a capturing group, meaning that if a match is found, the matched text is stored in a backreference so we can use it later. For example, if our string was foo; <space><space>, then the expression would match ; and the whitespace characters. The replacement pattern here is $1, which means the entire matched string would be replaced with just a semicolon.
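To make the effect concrete, here is a minimal sketch that applies the same pattern to a small string. The sample input is made up for illustration and is not from the original script.

```php
<?php
// Hypothetical sample input: two statements separated by spaces and a newline.
$js = "var a = 1;  \nvar b = 2;\n";

// Same pattern and replacement as the first preg_replace() in compress().
$result = preg_replace('/([;])\s+/', '$1', $js);

echo $result, PHP_EOL; // prints: var a = 1;var b = 2;
```

Every run of whitespace that follows a semicolon disappears, including the trailing newline after the last statement.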
Second regex
$buffer = preg_replace('/([}])\s+(else)/', '$1else', $buffer);
Explanation
( # beginning of the first capturing group
[}] # match the literal character '}'
) # ending of the first capturing group
\s+ # one or more whitespace characters
(else) # match and capture 'else'
The above regex removes any whitespace between a closing curly brace (}) and else. The replacement pattern here is $1else, which means the string with whitespace gets replaced by what was captured by the first capturing group ([}]), which is just the closing curly brace, followed by the keyword else. Nothing much to it.
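As a quick illustration, the sketch below runs the second pattern on a made-up if/else snippet (the input is an assumption, not part of the original script).

```php
<?php
// Hypothetical sample input with "else" on its own indented line.
$js = "if (ok) { run(); }\n  else { stop(); }";

// Same pattern and replacement as the second preg_replace() in compress().
$result = preg_replace('/([}])\s+(else)/', '$1else', $js);

echo $result, PHP_EOL; // prints: if (ok) { run(); }else { stop(); }
```

Only the whitespace between } and else is touched; the rest of the string is left alone.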
Third regex
$buffer = preg_replace('/([}])\s+(var)/', '$1;var', $buffer);
Explanation
( # beginning of the first capturing group
[}] # match the literal character '}'
) # ending of the first capturing group
\s+ # one or more whitespace characters
(var) # match and capture 'var'
This is almost the same as the previous regex. The keyword is var instead of else, and the replacement $1;var also inserts a semicolon between the closing brace and var. The semicolon character is optional in JavaScript when statements are separated by line breaks. But if you want to write multiple statements on a single line, there's no way for the interpreter to know they're separate statements, so a ; will need to be used to terminate each statement.
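The following sketch shows the inserted semicolon on a made-up input (the function and variable names are hypothetical).

```php
<?php
// Hypothetical sample input: a function declaration followed by a var statement.
$js = "function init() { start(); }\n\nvar ready = true;";

// Same pattern and replacement as the third preg_replace() in compress().
$result = preg_replace('/([}])\s+(var)/', '$1;var', $js);

echo $result, PHP_EOL; // prints: function init() { start(); };var ready = true;
```

The blank line between } and var is replaced by a single ; so the two statements stay separated once they share a line.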
Fourth regex
$buffer = preg_replace('/([{};])\s+(\$)/', '$1\$', $buffer);
Explanation
( # beginning of the first capturing group
[{};] # match the literal character '{' or '}' or ';'
) # ending of the first capturing group
\s+ # one or more whitespace characters
( # beginning of the second capturing group
\$ # match the literal character '$'
) # ending of the second capturing group
The replacement pattern here is $1\$, which means the entire matched string would be replaced with what was matched by the first capturing group ([{};]) followed by a literal $ character. In other words, any whitespace between {, } or ; and a following $ is removed.
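One plausible reason this line breaks a site, offered as an assumption since the actual JavaScript is not shown: if a statement ends in } without an explicit semicolon and the next line starts with a jQuery-style $( call, the newline was the only thing separating the two statements. The sketch below uses a made-up input to show what the fourth pattern produces in that case.

```php
<?php
// Hypothetical input: a function expression with no trailing semicolon,
// followed on the next line by a jQuery-style call.
$js = "var init = function () { setup(); }\n" . '$(document).ready(init);';

// Same pattern and replacement as the fourth preg_replace() in compress().
$result = preg_replace('/([{};])\s+(\$)/', '$1\$', $js);

echo $result, PHP_EOL;
// prints: var init = function () { setup(); }$(document).ready(init);
```

With the newline gone, JavaScript can no longer insert the missing semicolon automatically, so a browser would reject the joined line as a syntax error. The third regex avoids this for var by adding a ; itself, while the fourth one does not.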
This answer was only meant to explain the four regexes and what they do. The expressions could be improved a lot, but I'm not going into that, as regex replacement is not the correct approach here. As Qtax points out in the comments, you really should use a proper JS minifier for this task. You might want to check out Google's Closure Compiler - it looks pretty neat.
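As a small illustration of why plain regex replacement is fragile here, consider a line comment that happens to end in a semicolon. The input below is made up, and this is just one example of the kind of case a real minifier handles because it parses the code instead of pattern-matching it.

```php
<?php
// Hypothetical input: a statement, a line comment ending in ";", then more code.
$js = "var a = 1; // set a;\nalert(a);";

// Same pattern and replacement as the first preg_replace() in compress().
$result = preg_replace('/([;])\s+/', '$1', $js);

echo $result, PHP_EOL; // prints: var a = 1;// set a;alert(a);
```

After the replacement, alert(a); sits on the same line as the // comment, so it is commented out and never runs.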
If you're still confused about how it works, don't worry. Learning regexes can be difficult in the beginning. I suggest you use this website - http://regularexpressions.info. It is a pretty decent resource for learning regular expressions. If you're looking for a book, you might want to check out Mastering Regular Expressions by Jeffrey Friedl.
Upvotes: 5