ted.goodridge

Reputation: 150

How can I find multi-line JavaScript comment blocks using a regular expression?

I'm trying to pull comment blocks out of JavaScript files for a lightweight code documentation generator.

An example would be:

/** @Method: setSize
 * @Description: setSize DESCRIPTION
 * @param: setSize PARAMETER
 */

I need to pull out comments set up like this, ideally into an array.

I had gotten as far as this, but I realize it may not handle newlines, tabs, etc.:

\/\*\*(.*?)\*\/

(Okay, this seems like it would be simple, but I'm going in circles trying to get it to work.)

Upvotes: 4

Views: 2296

Answers (3)

rodneyrehm

Reputation: 13557

Depending on what you want to do with the extracted docblocks, several approaches come to mind. If you simply need the docblocks without further references, String.match() may suffice. Otherwise you might need the index of each block.

As others have already pointed out, JavaScript's regex engine is anything but powerful. If you're used to PCRE, this feels like working with your hands tied behind your back. [\s\S] (whitespace character or non-whitespace character) matches any character, line breaks included, so it behaves like "." in PCRE's dotAll mode.
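To make the difference concrete, here is a minimal sketch comparing the pattern from the question with a [\s\S] variant. The variable names (sample, dotPattern, anyCharPattern) are made up for illustration, and the sample is a shortened copy of the question's comment:

```javascript
// Shortened copy of the comment from the question, spanning several lines.
var sample = '/** @Method: setSize\n * @Description: setSize DESCRIPTION\n */';

// "." does not match line breaks, so the lazy group can never reach "*/".
var dotPattern = /\/\*\*(.*?)\*\//;

// [\s\S] matches any character at all, line breaks included.
var anyCharPattern = /\/\*\*([\s\S]*?)\*\//;

var dotResult = dotPattern.exec(sample);
var anyCharResult = anyCharPattern.exec(sample);

console.log(dotResult);                   // no match for the multi-line block
console.log(anyCharResult[0] === sample); // the whole block is matched
```

The lazy quantifier (*?) still matters in the second pattern: without it, the match would run on to the last */ in the file rather than the nearest one.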

This should get you started:

var string = 'var foo = "bar";'
    + '\n\n'
    + '/** @Method: setSize'
    + '\n * @Description: setSize DESCRIPTION'
    + '\n * @param: setSize PARAMETER'
    + '\n */'
    + '\n'
    + 'function setSize(setSize) { return true; }'
    + '\n\n'
    + '/** @Method: foo'
    + '\n * @Description: foo DESCRIPTION'
    + '\n * @param: bar PARAMETER'
    + '\n */'
    + '\n'
    + 'function foo(bar) { return true; }';

var docblock = /\/\*{2}([\s\S]+?)\*\//g,
    trim = function(string){ 
        return string.replace(/^\s+|\s+$/g, ''); 
    },
    split = function(string) {
        return string.split(/[\r\n]\s*\*\s+/);
    };

// extract all doc-blocks
console.log(string.match(docblock));

// extract all doc-blocks with access to character-index
var match;
while (match = docblock.exec(string)) {
    console.log(
        match.index + " characters from the beginning, found: ", 
        trim(match[1]), 
        split(match[1])
    );
}
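Since the question asks for an array, the exec() loop above could be wrapped up roughly as follows. This is only a sketch: extractDocblocks and the shape of the returned objects (index, lines) are assumptions made for illustration, not part of the code above.

```javascript
// Hypothetical helper: returns one entry per /** ... */ block found in source.
function extractDocblocks(source) {
    // A fresh regex per call, so lastIndex state is never shared between calls.
    var docblock = /\/\*{2}([\s\S]+?)\*\//g;
    var blocks = [];
    var match;
    while ((match = docblock.exec(source)) !== null) {
        blocks.push({
            index: match.index,
            lines: match[1]
                .split(/\r?\n/)
                .map(function (line) {
                    // Strip the leading " * " decoration and trailing whitespace.
                    return line.replace(/^\s*\*?\s*/, '').replace(/\s+$/, '');
                })
                .filter(function (line) { return line.length > 0; })
        });
    }
    return blocks;
}

var source = 'var foo = "bar";\n\n'
    + '/** @Method: setSize\n'
    + ' * @Description: setSize DESCRIPTION\n'
    + ' * @param: setSize PARAMETER\n'
    + ' */\n'
    + 'function setSize(setSize) { return true; }';

var blocks = extractDocblocks(source);
console.log(blocks);
```

Keep in mind that a regex has no notion of JavaScript syntax, so a /** sequence inside a string literal would be picked up as well.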

Upvotes: 5

mfeineis

Reputation: 2657

What about some magic :)

comment.replace(/@(\w+)\s*\:\s*(\S+)\s+(\w+)/gim, function (match, tag, name, descr) {
    console.log(arguments);
    // Do sth. ...
});

I haven't tested this, so there is no guarantee that the regex is right. It's just meant to point you to one possibility: doing the RegExp search the John Resig way 8-)
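For what it's worth, on the sample comment from the question the pattern above skips the "@Method: setSize" line, because it expects a second word after the name. Below is a variation on the same replace()-as-iterator idea that takes everything after the colon as the value. The regex and the tags array are my own assumptions, not the original author's code:

```javascript
var comment = '/** @Method: setSize\n'
    + ' * @Description: setSize DESCRIPTION\n'
    + ' * @param: setSize PARAMETER\n'
    + ' */';

var tags = [];

// [ \t] instead of \s keeps a match from running on to the next line;
// [^\r\n]* takes the rest of the current line as the value.
comment.replace(/@(\w+)[ \t]*:[ \t]*([^\r\n]*)/g, function (match, tag, value) {
    tags.push({ tag: tag, value: value.replace(/\s+$/, '') });
    return match; // the result of replace() is discarded; it only drives the loop
});

console.log(tags);
```

Each callback invocation receives the full match followed by the capture groups, so the tag name and its value arrive as separate arguments without any further parsing.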

Upvotes: 0

punkrockbuddyholly

Reputation: 9794

This should grab a comment block: \/\*\*[^/]+\/. I don't think a regexp is the best way to generate an array from these blocks, though. This regexp basically says:

Find a /** (the asterisks and the forward slash are escaped with \)

then find anything that isn't a /

then find one /

It's crude, but it should generally work. Here's a live example: http://regexr.com?300c6
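A quick sketch of where "crude" bites: because the pattern stops at the first /, any slash inside the comment (a URL, for instance) cuts the match short. The two sample strings below, including the example.com URL, are made up for illustration:

```javascript
var crude = /\/\*\*[^/]+\//g;

// No slash inside the comment body: the whole block is matched.
var plain = '/** @Method: setSize\n * @param: setSize PARAMETER\n */';

// A URL inside the comment: the match ends at the first "/" of "http://".
var withSlash = '/** See http://example.com/docs\n * @param: size\n */';

var plainMatches = plain.match(crude);
var slashMatches = withSlash.match(crude);

console.log(plainMatches);
console.log(slashMatches);
```

For comments that may contain slashes, a pattern that looks for the closing */ explicitly, such as \/\*\*[\s\S]*?\*\/, is the safer choice.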

Upvotes: 1
