Jules

Reputation: 2021

CFML RegEx to remove JavaScript comments

I want to remove JavaScript comments from a string using ColdFusion. I am currently using reReplace(string, "(\/\*.*\*\/)|\s(\/\/.{1,}[\r\n])", "", "all").

This is a test string:

<script type="text/javascript">
// comment
var a=1; // another comment
/* try{if (...)}; */
var b=2;
</script>
src="//domain.com"

The expected result (which is what I get using replace() in JavaScript) is:

<script type="text/javascript">
var a=1; 
var b=2;
</script>
src="//domain.com"

Actual CFML results:

<script type="text/javascript">
src="//domain.com"

Again, the same pattern works fine in JavaScript.

How can I get this working in CFML?


UPDATE 1: more specific code from my app. It's basically a minifier within app.cfc's OnRequest() function.

  1. Get the page html
  2. Remove both types of JS comments
  3. Convert \r to \n
  4. Replace \n+\t with a space
  5. Replace \t with a space
  6. Replace double spaces with a single space
  7. Replace double \n with a single \n
  8. Replace comma+\n with a comma

    <!--- Define arguments. --->
    <cfargument
        name="TargetPage"
        type="string"
        required="true"
        />
    
    <cfheader name="content-type" value="text/html; charset=utf-8" />
    <cfheader name="X-UA-Compatible" value="IE=edge" />
    <cfheader name="window-target" value="_top" />
    <cfheader name="imagetoolbar" value="no" />
    <cfheader name="viewport" value="width=device-width, initial-scale=1, maximum-scale=1, user-scalable=0" />
    
    <cfsavecontent variable="finalContent">
    <cfinclude template="#ARGUMENTS.TargetPage#" />
    </cfsavecontent>
    
    <cfset variables.regex = '(?:("\/\/[^"]*?")|\/\*.*?\*\/|\/\/.*?\n)'>
    <!--- <cfset finalContent = reReplace(finalContent,variables.regex, "\1", "ALL")> --->
    <cfset finalContent = replace(finalContent,  chr(13), chr(10), 'all')>
    <cfset finalContent = replace(finalContent,  chr(10)&chr(9), ' ', 'all')>
    <cfset finalContent = replace(finalContent,  chr(9), ' ', 'all')>
    <cfloop from="1" to="20" index="e">
        <cfset finalContent = replace(finalContent, '  ', ' ', 'all')>
        <cfset finalContent = replace(finalContent, chr(10)&chr(10), chr(10), 'all')>
    </cfloop>
    <cfset finalContent = replace(finalContent,  ','&chr(10), ',', 'all')>
    <cfset finalContent = replace(finalContent,  chr(10), '', 'all')>
    
    <cfoutput>#finalContent#</cfoutput>
    
    <cfreturn />
    

And some real (but truncated) output to play with:

<script src="//code.jquery.com/jquery-2.1.4.min.js"></script>
<script type="text/javascript">
//<![CDATA[
try{if (...) {...;}} catch(e){};
//]]>
// comment
var a=1; // another comment
/* try{if (...)}; */
var b=2;
</script>
<script type="text/javascript">
 unsavedChanges=0;
 tinymce.init({
     // GENERAL
     // PLUGINS
     // LINK
     link_list: "/pagesJSON.cfm", target_list: [
         {title: 'Same Window/Tab', value: '_self'}, {title: 'New Window/Tab', value: '_blank'}
     ],
     // FILE MANAGER
     external_filemanager_path: '/filemanager/',
     // IMAGE
     image_advtab: true
 });
 </script>
<link rel='stylesheet' href='https://fonts.googleapis.com/css?family=Lato%3A400%2C700%2C900&#038;ver=4.3.1' type='text/css' media='all'/>

Upvotes: 0

Views: 353

Answers (3)

Cœur

Reputation: 38667

Solution by OP.

The correct reReplace is:

reReplace(finalContent, '\/\*.*?\*\/|\s(\/\/.*?\r\n)', "", "ALL")

This produces the output below. It still needs some cleaning, but links and JS functions don't break!

<script src="//code.jquery.com/jquery-2.1.4.min.js"></script>
    <script type="text/javascript">
       try{if (...) {...;}} catch(e){};
          var a=1;    
    var b=2;
    </script>
    <script type="text/javascript">
     unsavedChanges=0;
     tinymce.init({
                                 link_list: "/pagesJSON.cfm", target_list: [
             {title: 'Same Window/Tab', value: '_self'}, {title: 'New Window/Tab', value: '_blank'}
         ],
                 external_filemanager_path: '/filemanager/',
                 image_advtab: true
     });
     </script>
    <link rel='stylesheet' href='https://fonts.googleapis.com/css?family=Lato%3A400%2C700%2C900&#038;ver=4.3.1' type='text/css' media='all'/> 
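
One possible explanation for the difference between the two patterns, sketched in JavaScript because that is where the question's comparison was made. The sketch assumes that `.` in CFML's reReplace also matches line breaks, which the `s` flag emulates here. Under that assumption the greedy `.{1,}` runs on to the last line break in the string, while the lazy `.*?` stops at the first one. The `page` string, the variable names, and the `\r\n` line endings are illustrative assumptions, not part of the original post.

```javascript
// Illustrative sample built from the question's test data,
// joined with Windows line endings (an assumption).
const page = [
  '<script type="text/javascript">',
  '// comment',
  'var a=1; // another comment',
  '/* try{if (...)}; */',
  'var b=2;',
  '</script>',
  'src="//domain.com"',
].join('\r\n');

// The `s` flag lets `.` match line breaks, standing in for the
// assumed CFML behaviour.
const greedyPattern = /(\/\*.*\*\/)|\s(\/\/.{1,}[\r\n])/gs; // pattern from the question
const lazyPattern = /\/\*.*?\*\/|\s(\/\/.*?\r\n)/gs;        // pattern from this answer

const greedyResult = page.replace(greedyPattern, '');
const lazyResult = page.replace(lazyPattern, '');

// JSON.stringify makes the remaining \r and \n characters visible.
console.log(JSON.stringify(greedyResult));
console.log(JSON.stringify(lazyResult));
```

With these inputs the greedy version drops both `var` statements, which matches the "Actual CFML results" in the question, while the lazy version keeps them along with the `src="//domain.com"` attribute.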

Upvotes: 0

David R

Reputation: 15647

The regular expression you are currently using appears to be incorrect (I confirmed this by checking it in an online regex tester).

Rewrite it as:

\/\*[\s\S]*?\*\/|([^:"]|^)\/\/.*$

I tested the above pattern at https://regex101.com/#javascript.

Try using \/\*[\s\S]*?\*\/|([^:"]|^)\/\/.*$ in your reReplace call; it should work for you.
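
A quick way to check this pattern is to run it in JavaScript, the flavour selected on regex101. The sketch below is illustrative only: the `source` string and variable names are assumptions, not part of the original post. Because `([^:"]|^)` consumes the character in front of `//`, the replacement puts that character back with `$1` (reReplace uses `\1` for the same purpose). The `m` flag makes `^` and `$` work per line; whether and how CFML needs an equivalent multi-line setting is not verified here.

```javascript
// Illustrative sample built from the question's test data.
const source = [
  '<script type="text/javascript">',
  '// comment',
  'var a=1; // another comment',
  '/* try{if (...)}; */',
  'var b=2;',
  '</script>',
  'src="//domain.com"',
].join('\n');

// g: replace every match; m: let ^ and $ match at line boundaries.
const commentPattern = /\/\*[\s\S]*?\*\/|([^:"]|^)\/\/.*$/gm;

// `$1` restores the character captured in front of `//`.
// It is empty when the block-comment branch matched.
const stripped = source.replace(commentPattern, '$1');

console.log(stripped);
```

The `//` in `src="//domain.com"` survives because it is preceded by `"`, which the character class excludes. Removed comments leave empty lines behind, so some whitespace cleanup is still needed afterwards.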

Hope this helps you!

Upvotes: -1

Abhishekh Gupta

Reputation: 6236

You can try this:

<!--- JS with comment --->
<cfsavecontent variable="variables.jsWithCommment">
    <script type="text/javascript">
    // comment
    var a=1; // another comment
    /* try{if (...)}; */
    var b=2;
    </script>
    src="//domain.com"
</cfsavecontent>

<!--- Replace with first capture for each branch --->
<cfset variables.regex = '(?:("\/\/[^"]*?")|\/\*.*?\*\/|\/\/.*?\n)'>
<cfset variables.jsWithoutComment = reReplace(variables.jsWithCommment, variables.regex, "\1", "ALL")>

Regex:

Branch 1: ("\/\/[^"]*?")  ==> Protocol-relative URL in quotes; captured so that \1 puts it back unchanged
Branch 2: \/\*.*?\*\/     ==> Multi-line comment
Branch 3: \/\/.*?\n       ==> Single-line comment

Here is the TryCF.
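
For comparison, the same three-branch pattern can be tried in JavaScript. This is only a sketch under stated assumptions: the `s` flag stands in for CFML's `.` matching line breaks (which the multi-line comment branch relies on), `$1` plays the role of `\1`, and the `input` string mirrors the test data above with `\n` line endings. The variable names are illustrative.

```javascript
// Illustrative input: the test data from the question, each line ending in \n.
const input =
  '<script type="text/javascript">\n' +
  '// comment\n' +
  'var a=1; // another comment\n' +
  '/* try{if (...)}; */\n' +
  'var b=2;\n' +
  '</script>\n' +
  'src="//domain.com"\n';

// Branch 1 captures a quoted protocol-relative URL,
// branch 2 matches /* ... */, branch 3 matches // ... up to the line break.
// s: let `.` match line breaks (assumed to mirror CFML's behaviour).
const threeBranch = /(?:("\/\/[^"]*?")|\/\*.*?\*\/|\/\/.*?\n)/gs;

// `$1` re-inserts the captured URL; for the two comment branches
// the group is unset, so the match is replaced with nothing.
const withoutComments = input.replace(threeBranch, '$1');

console.log(withoutComments);
```

Branch 3 consumes the trailing `\n`, so a `//` comment on a final line without a line break would be left in place.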

Upvotes: 5
