Reputation: 3
I can use string.split("\\W+") to split a string into words that contain only word characters.
However:
I don't want to break up hyphenated words such as "re-use" into "re" & "use".
The same goes for words with multiple hyphens, like "out-of-the-way".
But I do want to break "and--oh" into "and" & "oh".
How can I achieve that?
Upvotes: 0
Views: 408
Reputation: 11075
You can first replace each run of two or more consecutive hyphens with a special character, and then do a simple regex split that treats a single hyphen as part of a word.
Please refer to the code below.
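public class Test {
    public static void main(String[] args) {
        String str = "This is^^some@@words-apple-banana--orange";
        // Collapse each run of two or more hyphens into one separator character.
        str = str.replaceAll("[-]{2,}", "@");
        System.out.println(str);
        // Split on anything that is neither a word character nor a hyphen.
        String regex = "[^\\w-]+";
        String[] arr = str.split(regex);
        for (String item : arr) {
            System.out.println(item);
        }
    }
}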
The result is:
This is^^some@@words-apple-banana@orange
This
is
some
words-apple-banana
orange
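If you would rather skip the placeholder step, one possible alternative is to match the words themselves instead of splitting on the separators. This is only a sketch: the class name HyphenWords and the method words are illustrative names, and it assumes a word is a run of \w characters (which also covers digits and underscores) optionally joined by single hyphens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HyphenWords {
    // One or more word characters, followed by zero or more groups of
    // exactly one hyphen plus more word characters ("re-use", "out-of-the-way").
    private static final Pattern WORD = Pattern.compile("\\w+(?:-\\w+)*");

    // Hypothetical helper: collects every match of WORD in the given text.
    static List<String> words(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("re-use out-of-the-way and--oh"));
    }
}
```

Because a double hyphen can never sit inside a match, "and--oh" comes out as two separate words, while "re-use" and "out-of-the-way" stay whole.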
Upvotes: 1