Reputation: 6551
I need to add spaces between all punctuation in a string.
\\ "Hello: World." -> "Hello : World ."
\\ "It's 9:00?" -> "It ' s 9 : 00 ?"
\\ "1.B,3.D!" -> "1 . B , 3 . D !"
I think a regex is the way to go: match all non-punctuation with [a-zA-Z\\d]+, add a space before and/or after, then extract the remainder by matching all punctuation with [^a-zA-Z\\d]+.
But I don't know how to (recursively?) call this regex. Looking at the first example, the regex will only match the "Hello". I was thinking of just building a new string by continuously removing and appending the first instance of the matched regex, while the original string is not empty.
private String addSpacesBeforePunctuation(String s) {
    StringBuilder builder = new StringBuilder();
    final String nonpunctuation = "[a-zA-Z\\d]+";
    final String punctuation = "[^a-zA-Z\\d]+";
    String found;
    while (!s.isEmpty()) {
        // regex stuff goes here
        found = ???; // found group from respective regex goes here
        builder.append(found);
        builder.append(" ");
        s = s.replaceFirst(found, "");
    }
    return builder.toString().trim();
}
However, this doesn't feel like the right way to go... I think I'm overcomplicating things...
Upvotes: 3
Views: 1183
Reputation: 785481
You can use a lookaround-based regex with the punctuation property \p{Punct} in Java:
str = str.replaceAll("(?<=\\S)(?:(?<=\\p{Punct})|(?=\\p{Punct}))(?=\\S)", " ");
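As a quick check, the one-liner above can be wrapped in a small program and run against the three examples from the question. This is only a sketch: the class name LookaroundSpacer and the method name addSpaces are made up for illustration, and the pattern is precompiled purely as a convenience.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper class; the names are illustrative, not from the answer.
class LookaroundSpacer {
    // Zero-width position with non-whitespace on both sides
    // and a punctuation character on at least one side.
    private static final Pattern GAP = Pattern.compile(
            "(?<=\\S)(?:(?<=\\p{Punct})|(?=\\p{Punct}))(?=\\S)");

    static String addSpaces(String s) {
        // Each matched position is empty, so replacing it inserts one space.
        return GAP.matcher(s).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(addSpaces("Hello: World.")); // Hello : World .
        System.out.println(addSpaces("It's 9:00?"));    // It ' s 9 : 00 ?
        System.out.println(addSpaces("1.B,3.D!"));      // 1 . B , 3 . D !
    }
}
```

Note that \p{Punct} in Java covers ASCII punctuation only by default, so other Unicode punctuation would need a different character class.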
(?<=\\S) asserts that the previous char is not whitespace
(?<=\\p{Punct}) asserts that the previous char is a punctuation char
(?=\\p{Punct}) asserts that the next char is a punctuation char
(?=\\S) asserts that the next char is not whitespace
Upvotes: 5
Reputation: 726809
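When you see a punctuation mark, you have four possibilities:
1. The mark has whitespace (or the start/end of the string) on both sides
2. The mark is preceded by a non-whitespace character
3. The mark is followed by a non-whitespace character
4. The mark is both preceded and followed by non-whitespace characters
Here is code that does the replacement properly: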
String ss = s
.replaceAll("(?<=\\S)\\p{Punct}", " $0")
.replaceAll("\\p{Punct}(?=\\S)", "$0 ");
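It uses two expressions: the first inserts a space before any punctuation mark that is preceded by non-whitespace (case 2), and the second inserts a space after any punctuation mark that is followed by non-whitespace (case 3). Since the second expression is applied to the output of the first, together they take care of case 4 as well. Case 1 requires no change.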
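To see the two passes at work, here is a minimal runnable sketch that applies them to the examples from the question. The class name TwoPassSpacer and the method name addSpaces are hypothetical, chosen only for this illustration.

```java
// Hypothetical wrapper class; the names are illustrative, not from the answer.
class TwoPassSpacer {
    static String addSpaces(String s) {
        return s
                // Pass 1: add a space before punctuation preceded by non-whitespace.
                .replaceAll("(?<=\\S)\\p{Punct}", " $0")
                // Pass 2: add a space after punctuation followed by non-whitespace.
                .replaceAll("\\p{Punct}(?=\\S)", "$0 ");
    }

    public static void main(String[] args) {
        System.out.println(addSpaces("Hello: World.")); // Hello : World .
        System.out.println(addSpaces("It's 9:00?"));    // It ' s 9 : 00 ?
        System.out.println(addSpaces("1.B,3.D!"));      // 1 . B , 3 . D !
    }
}
```

For "It's 9:00?" the intermediate string after the first pass is "It 's 9 :00 ?", and the second pass then separates each mark from the character that follows it.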
Upvotes: 2