Reputation: 33
I want to replace each Java-style block comment (/* */) with as many newlines as the comment contains. So far, I can only come up with something that replaces comments with an empty string:
String.replaceAll("/\\*[\\s\\S]*?\\*/", "")
Is it possible to replace each match with the newlines it contains instead? If this is not possible with regex matching alone, what's the best way to do it?
For example,
/* This comment
has 2 new lines
contained within */
will be replaced with a string of just 2 new lines.
Upvotes: 2
Views: 313
Reputation:
Since Java supports the \G construct, you can do it all in one go.
Use a global regex replace (String.replaceAll in Java).
Find (without the /.../ delimiters shown on regex101, which Java would treat as literal slashes)
"(?:\\/\\*(?=[\\S\\s]*?\\*\\/)|(?<!\\*\\/)(?!^)\\G)(?:(?!\\r?\\n|\\*\\/).)*((?:\\r?\\n)?)(?:\\*\\/)?"
Replace
"$1"
https://regex101.com/r/l1VraO/1
Expanded
(?:
/ \*
(?= [\S\s]*? \* / )
|
(?<! \* / )
(?! ^ )
\G
)
(?:
(?! \r? \n | \* / )
.
)*
( # (1 start)
(?: \r? \n )?
) # (1 end)
(?: \* / )?
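As a rough illustration, the pattern above could be applied from Java as sketched below. The class name CommentToNewlines and the method stripKeepingNewlines are made up for this example, and the pattern is written without the surrounding / delimiters on the assumption that it is passed straight to java.util.regex.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper class; only the regex itself comes from the answer.
class CommentToNewlines {
    // Same pattern as in "Find", split into pieces for readability.
    private static final Pattern COMMENT_CHUNK = Pattern.compile(
            // start of a terminated comment, or continue right after the previous chunk
            "(?:/\\*(?=[\\S\\s]*?\\*/)|(?<!\\*/)(?!^)\\G)"
            // rest of the current line inside the comment
            + "(?:(?!\\r?\\n|\\*/).)*"
            // group 1: the line break ending this chunk, if any
            + "((?:\\r?\\n)?)"
            // optional end of the comment
            + "(?:\\*/)?");

    static String stripKeepingNewlines(String source) {
        // Each chunk of a comment is replaced by the line break it ended with.
        return COMMENT_CHUNK.matcher(source).replaceAll("$1");
    }

    public static void main(String[] args) {
        String source = "a\n/* This comment\nhas 2 new lines\ncontained within */\nb";
        // Print with visible \n markers so the kept line breaks are easy to count.
        System.out.println(stripKeepingNewlines(source).replace("\n", "\\n"));
    }
}
```

Each comment line is matched as its own chunk, which is why the replacement only needs to refer to a single captured line break.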
==================================================
==================================================
If you ever need to handle comment delimiters that appear inside
quoted strings, like this
String comment = "/* this is a comment*/"
here is an extended regex that matches quoted strings as well as comments.
It is still a single regex, applied once in a global find / replace.
Find (without the /.../ delimiters shown on regex101, which Java would treat as literal slashes)
"(\"[^\"\\\\]*(?:\\\\[\\S\\s][^\"\\\\]*)*\")|(?:\\/\\*(?=[\\S\\s]*?\\*\\/)|(?<!\")(?<!\\*\\/)(?!^)\\G)(?:(?!\\r?\\n|\\*\\/).)*((?:\\r?\\n)?)(?:\\*\\/)?"
Replace
"$1$2"
https://regex101.com/r/tUwuAI/1
Expanded
( # (1 start)
"
[^"\\]*
(?:
\\ [\S\s]
[^"\\]*
)*
"
) # (1 end)
|
(?:
/ \*
(?= [\S\s]*? \* / )
|
(?<! " )
(?<! \* / )
(?! ^ )
\G
)
(?:
(?! \r? \n | \* / )
.
)*
( # (2 start)
(?: \r? \n )?
) # (2 end)
(?: \* / )?
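A minimal sketch of how the string-aware pattern might be used from Java follows. The class name StringAwareCommentToNewlines and the method stripKeepingNewlines are invented for illustration, and the pattern is again assumed to be passed to java.util.regex without / delimiters.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper class; only the regex itself comes from the answer.
class StringAwareCommentToNewlines {
    private static final Pattern STRING_OR_COMMENT_CHUNK = Pattern.compile(
            // group 1: a double-quoted string literal, including escaped characters
            "(\"[^\"\\\\]*(?:\\\\[\\S\\s][^\"\\\\]*)*\")"
            // or: start of a terminated comment, or continue after the previous comment chunk
            + "|(?:/\\*(?=[\\S\\s]*?\\*/)|(?<!\")(?<!\\*/)(?!^)\\G)"
            // rest of the current line inside the comment
            + "(?:(?!\\r?\\n|\\*/).)*"
            // group 2: the line break ending this chunk, if any
            + "((?:\\r?\\n)?)"
            // optional end of the comment
            + "(?:\\*/)?");

    static String stripKeepingNewlines(String source) {
        // Strings are put back unchanged ($1); comment chunks keep only their line break ($2).
        return STRING_OR_COMMENT_CHUNK.matcher(source).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        String source = "String s = \"/* not a comment */\";\n/* real\ncomment */\nint x;";
        System.out.println(stripKeepingNewlines(source));
    }
}
```

Because the string literal is consumed by the first alternative, the /* inside it is never seen as a comment start.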
Upvotes: 1
Reputation: 159086
You can do it with a regex "replacement loop".
Most easily done in Java 9+:
String result = Pattern.compile("/\\*(?:[^*]++|\\*(?!/))*+\\*/").matcher(input)
        .replaceAll(r -> r.group().replaceAll(".*", ""));
The main regex has been optimized for performance. The lambda has not been optimized.
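If the inner replaceAll ever matters for performance, one possible alternative is to copy only the line-break characters out of each match by hand. This is a sketch under assumptions: the class NewlineOnlyReplacement and its methods are hypothetical names, it treats only '\r' and '\n' as line breaks, and it needs Java 9+ for Matcher.replaceAll with a function argument.

```java
import java.util.regex.MatchResult;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class NewlineOnlyReplacement {
    // Same comment pattern as above, compiled once and reused.
    private static final Pattern BLOCK_COMMENT =
            Pattern.compile("/\\*(?:[^*]++|\\*(?!/))*+\\*/");

    // Keeps only '\r' and '\n' from the matched comment (assumed line breaks).
    static String lineBreaksOf(MatchResult match) {
        String comment = match.group();
        StringBuilder kept = new StringBuilder();
        for (int i = 0; i < comment.length(); i++) {
            char c = comment.charAt(i);
            if (c == '\n' || c == '\r') {
                kept.append(c);
            }
        }
        // Safe as a replacement string: it never contains '$' or '\\'.
        return kept.toString();
    }

    static String replaceComments(String input) {
        return BLOCK_COMMENT.matcher(input).replaceAll(NewlineOnlyReplacement::lineBreaksOf);
    }

    public static void main(String[] args) {
        String input = "a /* x\ny\nz */ b";
        // Print with visible \n markers so the kept line breaks are easy to count.
        System.out.println(replaceComments(input).replace("\n", "\\n"));
    }
}
```

This avoids running a second regex per comment, at the cost of a few more lines of code.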
For all Java versions:
Matcher m = Pattern.compile("/\\*(?:[^*]++|\\*(?!/))*+\\*/").matcher(input);
StringBuffer buf = new StringBuffer();
while (m.find())
    m.appendReplacement(buf, m.group().replaceAll(".*", ""));
String result = m.appendTail(buf).toString();
Test
final String input = "Line 1\n"
        + "/* Inline comment */\n"
        + "Line 3\n"
        + "/* One-line\n"
        + " comment */\n"
        + "Line 6\n"
        + "/* This\n"
        + " comment\n"
        + " has\n"
        + " 4\n"
        + " lines */\n"
        + "Line 12";
Matcher m = Pattern.compile("/\\*(?:[^*]++|\\*(?!/))*+\\*/").matcher(input);
String result = m.replaceAll(r -> r.group().replaceAll(".*", ""));
// Show input/result side-by-side
String[] inLines = input.split("\n", -1);
String[] resLines = result.split("\n", -1);
int lineCount = Math.max(inLines.length, resLines.length);
System.out.println("input                    |result");
System.out.println("-------------------------+-------------------------");
for (int i = 0; i < lineCount; i++) {
    System.out.printf("%-25s|%s%n", (i < inLines.length ? inLines[i] : ""),
                                    (i < resLines.length ? resLines[i] : ""));
}
Output
input                    |result
-------------------------+-------------------------
Line 1                   |Line 1
/* Inline comment */     |
Line 3                   |Line 3
/* One-line              |
 comment */              |
Line 6                   |Line 6
/* This                  |
 comment                 |
 has                     |
 4                       |
 lines */                |
Line 12                  |Line 12
Upvotes: 1
Reputation: 27723
Maybe this expression,
\/\*.*?\*\/
in s (DOTALL) mode, might be close to what you have in mind.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class re {
    public static void main(String[] args) {
        final String regex = "\\/\\*.*?\\*\\/";
        final String string = "/* This comment\n"
                + "has 2 new lines\n"
                + "contained within */\n\n"
                + "Some codes here 1\n\n"
                + "/* This comment\n"
                + "has 2 new lines\n"
                + "contained within \n"
                + "*/\n\n\n"
                + "Some codes here 2";
        // Fixed replacement: every comment becomes exactly two newlines,
        // regardless of how many line breaks it actually contained.
        final String subst = "\n\n";
        // DOTALL lets '.' match line breaks, so a comment can span several lines.
        final Pattern pattern = Pattern.compile(regex, Pattern.DOTALL);
        final Matcher matcher = pattern.matcher(string);
        final String result = matcher.replaceAll(subst);
        System.out.println(result);
    }
}
Output (blank lines omitted)
Some codes here 1
Some codes here 2
If you wish to explore, simplify, or modify the expression, it is explained in the top right panel of regex101.com. If you'd like, you can also see in this link how it matches against some sample inputs.
Upvotes: 0