Jochen

Reputation: 1776

Remove HTML and JavaScript comments automatically

I want to remove HTML and JavaScript comments automatically. I use Ant scripts for deployment and JSF on the server. What options or tools are available? Thanks in advance.

Upvotes: 2

Views: 7278

Answers (4)

vitaly-t

Reputation: 25840

The decomment library does exactly what you describe: it removes comments from JSON, JavaScript, CSS, HTML, etc.

For use within gulp, see gulp-decomment.

Upvotes: 1

aMarCruz

Reputation: 2852

Stripping comments with regexes from files that mix HTML and JavaScript is risky. Handled separately, however, each case can be done with good performance and without external tools, using only Node.js.

For HTML comments, use the regex /<!--(?!>)[\S\s]*?-->/g. Example:

function stripHtmlComments(content) {
  return content.replace(/<!--(?!>)[\S\s]*?-->/g, '');
}
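
As a quick check, the following sketch runs the function on a small, made-up HTML string. The sample input is an assumption for illustration only; the function is repeated so the snippet runs on its own.

```javascript
// Same function as above, repeated so this snippet is self-contained.
function stripHtmlComments(content) {
  return content.replace(/<!--(?!>)[\S\s]*?-->/g, '');
}

// Hypothetical sample: one single-line and one multi-line comment.
var sample = '<p>one</p><!-- first -->\n<p>two</p><!--\n  second\n-->\n<p>three</p>';

// Both comments are removed; the surrounding markup and line breaks stay.
console.log(stripHtmlComments(sample));
```

Note that the pattern also matches conditional comments such as <!--[if IE]>...<![endif]-->, so those are removed as well.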

Removing JavaScript comments is a bit more complex: you need to combine several regexes to tell when a comment marker sits inside a string literal or a regex literal, and when a slash is a division operator rather than the start of a regex :)
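
To see why a single simple regex is not enough, here is a minimal sketch of a naive approach (the function name and the sample line are made up for illustration). It cuts everything from the first "//" to the end of the line, which damages a string literal containing a URL:

```javascript
// Naive approach (NOT recommended): drop everything from "//" to end of line.
function naiveStripLineComments(code) {
  return code.replace(/\/\/.*$/gm, '');
}

// Hypothetical input: the "//" inside the string is not a comment.
var source = 'var url = "http://example.com"; // site address';

// The string literal is truncated, leaving broken code: var url = "http:
console.log(naiveStripLineComments(source));
```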

This tiny program removes both multiline and single-line comments from JavaScript files:

#!/usr/bin/env node
/*
  Removes multiline and single-line comments from a JavaScript source file.
  Author: aMarCruz - https://github.com/aMarCruz
  Usage: node [this-tool] [js-file]
*/
var path = require('path'),
    fs = require('fs'),
    file,
    str;

var RE_BLOCKS = new RegExp([
    /\/(\*)[^*]*\*+(?:[^*\/][^*]*\*+)*\//.source,           // $1: multi-line comment
    /\/(\/)[^\n]*$/.source,                                 // $2 single-line comment
    /"(?:[^"\\]*|\\[\S\s])*"|'(?:[^'\\]*|\\[\S\s])*'/.source, // string, don't care about embedded eols
    /(?:[$\w\)\]]|\+\+|--)\s*\/(?![*\/])/.source,           // division operator
    /\/(?=[^*\/])[^[/\\]*(?:(?:\[(?:\\.|[^\]\\]*)*\]|\\.)[^[/\\]*)*?\/[gim]*/.source
    ].join('|'),                                            // regex
    'gm'  // note: global+multiline with replace() need test
    );

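file = process.argv[2];
if (!file) {
    // no input file given: show usage instead of failing inside path.extname()
    console.error('Usage: node [this-tool] [js-file]');
    process.exit(1);
}
if (!path.extname(file))
    file += '.js';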
str = fs.readFileSync(file, { encoding: 'utf8' });

console.log(stripJSComments(str));

// remove comments, keep other blocks
function stripJSComments(str) {
    return str.replace(RE_BLOCKS, function (match, mlc, slc) {
        return mlc ? ' ' :     // multiline comment (must be replaced with one space)
               slc ? '' :      // single-line comment
               match;          // divisor, regex, or string, return as-is
        });
}

For example, save the script as rcomms and run it with:

node rcomms source-file > clean-file.js

NOTE: This code is based on regexes from jspreproc. If you need more advanced processing, please visit http://github.com/aMarCruz/jspreproc.

I wrote jspreproc to deploy some riot modules. jspreproc removes empty lines, supports filters to preserve some comments, and supports C-style conditional comments: #if, #else, #endif, #define, #include, etc.

Upvotes: 4

Licson

Reputation: 2271

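You can use regular expressions to remove them. For example, you can remove HTML comments by replacing the matches of the regular expression /<!--(.*)-->/gi with nothing. Note that this pattern is greedy and that "." does not match line breaks, so it is only reliable for a single comment on a single line.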
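
A minimal sketch comparing a greedy pattern with a lazy one on a made-up string (the variable names and the sample are assumptions for illustration). With two comments on one line, the greedy (.*) runs to the last "-->" and removes the markup in between:

```javascript
// Hypothetical sample: two comments with real markup between them.
var html = '<!-- a --><p>keep</p><!-- b -->';

// Greedy: (.*) extends to the LAST "-->", so the paragraph is lost too.
var greedy = html.replace(/<!--(.*)-->/g, '');

// Lazy: [\s\S]*? stops at the FIRST "-->" and also spans line breaks.
var lazy = html.replace(/<!--[\s\S]*?-->/g, '');

console.log(JSON.stringify(greedy)); // ""
console.log(JSON.stringify(lazy));   // "<p>keep</p>"
```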

Upvotes: 1

tbraun89

Reputation: 2234

Create a new target and use replaceregexp to strip all comments and anything else you don't want in these files.

You could do something like this for HTML, and something similar for JS:

<target name="-trim.html.comments">

    <fileset id="html.fileset"
        dir="${build.dir}"
        includes="**/*.jsp, **/*.php, **/*.html"/>

    <!-- HTML Comments -->
    <replaceregexp replace="" flags="g"
        match="\&lt;![ \r\n\t]*(--([^\-]|[\r\n]|-[^\-])*--[ \r\n\t]*)\&gt;">
        <fileset refid="html.fileset"/>
    </replaceregexp>

</target>

Source: http://www.julienlecomte.net/blog/2007/09/23/

Upvotes: 0
