BaltzarKeyboard

Reputation: 3

Regular expression for splitting text that contains hyphenated words

I can use string.split("\\W+") to split text into words made up only of word characters.

However:

  1. I don't want to break words such as "re-use" into "re" & "use".
    The same goes for words with multiple hyphens, like "out-of-the-way".

  2. I want to break "and--oh" into "and" & "oh".

How can I achieve that?

Upvotes: 0

Views: 408

Answers (2)

Eugene

Reputation: 11075

You can first replace each run of consecutive hyphens with a special character, and then do a simple regex split that treats a single hyphen as part of a word.

Please refer to the code below.

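public class Test {
    public static void main(String[] args) {
        String str = "This is^^some@@words-apple-banana--orange";
        // Collapse every run of two or more hyphens into a single "@",
        // which the split below treats as a delimiter.
        str = str.replaceAll("[-]{2,}", "@");
        System.out.println(str);
        // Split on anything that is neither a word character nor a hyphen,
        // so single hyphens stay inside words.
        String regex = "[^\\w-]+";
        String[] arr = str.split(regex);
        for (String item : arr) {
            System.out.println(item);
        }
    }
}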

The result is:

This is^^some@@words-apple-banana@orange
This
is
some
words-apple-banana
orange

Upvotes: 1

Ibrahim

Reputation: 6088

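Try this regex, which splits either on a run of characters that are neither word characters nor hyphens, or on two or more consecutive hyphens: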

string.split("[^\\w\\-]+|--+")
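As a rough sketch of how this behaves on the examples from the question (the class and method names here are made up for illustration, not part of any library):

```java
import java.util.Arrays;

class HyphenSplitDemo {
    // Splits on runs of non-word, non-hyphen characters, or on 2+ hyphens.
    static String[] splitWords(String text) {
        return text.split("[^\\w\\-]+|--+");
    }

    // Hypothetical variant: groups the alternatives so that adjacent
    // delimiters such as "-- " are consumed as one separator and do not
    // produce an empty string between them.
    static String[] splitWordsMerged(String text) {
        return text.split("(?:[^\\w-]|--+)+");
    }

    public static void main(String[] args) {
        // prints [re-use, out-of-the-way, and, oh]
        System.out.println(Arrays.toString(
                splitWords("re-use out-of-the-way and--oh")));
        // prints [and, oh]
        System.out.println(Arrays.toString(
                splitWordsMerged("and-- oh")));
    }
}
```

Note that with the original pattern, input like "and-- oh" matches as two back-to-back delimiters and leaves an empty string in the result, which is why the grouped variant may be preferable if your text mixes dashes and spaces. Also keep in mind that \w includes the underscore and, by default in Java, only ASCII letters and digits.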

Upvotes: 2
