Reputation: 2585
I am working on a personal project where I need to extract the actual comment text from input strings like these:
Case 1: /* Some useful text */
Output: Some useful text
Case 2: /*** This is formatted obnoxiously**/
Output: This is formatted obnoxiously
Case 3:
/**
More useful
information
*/
Output: More useful information
Case 4:
/**
Prompt the user to type in
the number. Assign the number to v
*/
Output: Prompt the user to type in the number. Assign the number to v
I am working in Java and have tried to strip /* and */ with a naive approach such as String.replace, but because a comment can be formatted in different ways, as shown above, replace does not seem to be a viable approach. How can I achieve the above outputs using a regex?
Here is the test comment file that I am using.
Upvotes: 0
Views: 2635
Reputation: 46209
You can use the following regex:
String newString = oldString.replaceAll("/\\*+\\s*|\\s*\\*+/", "");
EDIT
To also turn newlines inside the comment into single spaces, you could chain a second replacement:
String stripped = oldString.replaceAll("/\\*+\\s*|\\s*\\*+/", "");
String newString = stripped.replaceAll("\\s*[\r\n]+\\s*", " ");
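As a minimal sketch of the replace-based approach, the snippet below runs it against the four cases from the question. The class name CommentStripper and the helper strip are hypothetical names chosen for illustration, and collapsing every whitespace run to one space is an assumption about the desired output based on Cases 3 and 4.

```java
class CommentStripper {
    // Hypothetical helper: removes the opening and closing comment delimiters,
    // then collapses each run of whitespace (newlines included) into one space.
    static String strip(String comment) {
        return comment
                .replaceAll("/\\*+\\s*|\\s*\\*+/", "")
                .replaceAll("\\s+", " ")
                .trim();
    }

    public static void main(String[] args) {
        String[] cases = {
            "/* Some useful text */",
            "/*** This is formatted obnoxiously**/",
            "/**\nMore useful\ninformation\n*/",
            "/**\nPrompt the user to type in\nthe number. Assign the number to v\n*/"
        };
        for (String c : cases) {
            System.out.println(strip(c));
        }
    }
}
```

This sketch does not remove the leading asterisks that Javadoc-style comments put at the start of each line; that would need an additional replacement.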
Upvotes: 1
Reputation: 8090
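Try something like:
"/\\*+\\s*(.*?)\\s*\\*+/"
The \\s* before the closing delimiter keeps trailing whitespace out of the captured group. The dot must also match newlines, so compile the pattern with Pattern.DOTALL:
Pattern p = Pattern.compile("/\\*+\\s*(.*?)\\s*\\*+/", Pattern.DOTALL);
EDIT
A complete example:
Pattern p = Pattern.compile("/\\*+\\s*(.*?)\\s*\\*+/", Pattern.DOTALL);
Matcher m = p.matcher("/*** This is formatted obnoxiously**/");
if (m.find()) { // group(1) is only valid after a successful match
    String sanitizedComment = m.group(1);
    System.out.println(sanitizedComment); // prints: This is formatted obnoxiously
}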
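For input that holds several comments mixed with other text, such as a test file, the same idea can be applied in a loop. The following is a sketch only: CommentExtractor and extractComments are hypothetical names, the pattern trims whitespace before the closing delimiter, and joining multi-line comments with single spaces is an assumption taken from the expected outputs in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentExtractor {
    // DOTALL lets '.' cross line breaks; the reluctant group stops at the
    // first closing delimiter, so adjacent comments are matched separately.
    private static final Pattern BLOCK_COMMENT =
            Pattern.compile("/\\*+\\s*(.*?)\\s*\\*+/", Pattern.DOTALL);

    // Hypothetical helper: returns the text of every block comment in source.
    static List<String> extractComments(String source) {
        List<String> comments = new ArrayList<>();
        Matcher m = BLOCK_COMMENT.matcher(source);
        while (m.find()) {
            // Collapse internal newlines and repeated spaces into one space.
            comments.add(m.group(1).replaceAll("\\s+", " "));
        }
        return comments;
    }

    public static void main(String[] args) {
        String source = "/* Some useful text */\nint v;\n/**\nMore useful\ninformation\n*/\n";
        for (String comment : extractComments(source)) {
            System.out.println(comment);
        }
    }
}
```

A regex like this does not know about string literals or line comments, so a "/*" inside a quoted string would still be treated as the start of a comment; a real parser is the safer choice for arbitrary source code.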
Upvotes: 2