BullyWiiPlaza

Reputation: 19185

Removing all multi-line comments

I researched this for a while, but surprisingly none of the methods or regular expressions I found worked properly.

I need a method that removes all kinds of single and multi-line comments from a source code file.

Several of the regular expressions I tried, such as

sourceCode.replaceAll("(/\\*([^*]|[\\r\\n]|(\\*+([^*/]|[\\r\\n])))*\\*+/|[ \\t]*//.*)", "");

resulted in an exception:

Exception in thread "main" java.lang.StackOverflowError

I also found solutions that worked well but still left a few comment characters floating around in the processed source code, which shouldn't happen.

Another method worked almost perfectly, but it failed on comments of the form /*// Hi */ and left those blocks untouched.

I got a different result from every regex I tried. How can I accomplish this task reliably?

Upvotes: 2

Views: 392

Answers (1)

Lucas Trzesniewski

Reputation: 51330

Here's a simplified version from my answer on JavaScript comment removal:

Replace:

(?m)((["'])(?:\\.|.)*?\2)|//.*?$|/\*[\s\S]*?\*/

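With $1. Group 1 captures string and character literals, so they are put back unchanged, while comments (where group 1 is empty) are replaced with nothing.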


The answer I linked to explains in detail how this pattern works. This version is simpler because Java doesn't have regex literals in the language syntax; those make the replacement much nastier.
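For reference, here is a minimal sketch of how the pattern above could be applied from Java. The class and method names (CommentStripper, stripComments) are made up for illustration, and the only real work is escaping the regex for a Java string literal. It assumes ordinary single-line string and char literals; Java text blocks (""") and very long literals are not accounted for.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name is illustrative only.
class CommentStripper {

    // Same pattern as above, escaped for a Java string literal:
    //   (?m)((["'])(?:\\.|.)*?\2)   a quoted literal, captured in group 1
    //   |//.*?$                     a single-line comment up to end of line
    //   |/\*[\s\S]*?\*/             a block comment, possibly spanning lines
    private static final Pattern COMMENT_OR_LITERAL = Pattern.compile(
            "(?m)(([\"'])(?:\\\\.|.)*?\\2)|//.*?$|/\\*[\\s\\S]*?\\*/");

    // Replaces every match with group 1. For a literal, that is the literal
    // itself; for a comment, group 1 did not participate, so nothing is inserted.
    static String stripComments(String sourceCode) {
        return COMMENT_OR_LITERAL.matcher(sourceCode).replaceAll("$1");
    }

    public static void main(String[] args) {
        String source = "int a = 1; // one\n"
                + "/*// Hi */int b = 2;\n"
                + "String s = \"http://x\"; /* multi\nline */";
        System.out.println(stripComments(source));
    }
}
```

Because the quoted-literal alternative comes first, a // or /* inside a string such as "http://x" is consumed as part of the literal and never seen as a comment start. The /*// Hi */ case from the question is handled by the block-comment alternative, since /* does not begin with //.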

Upvotes: 2
