Cael

Reputation: 556

String splitting on 3 or more words

I have code that splits a string into groups of 2 words and puts them in an array. It turns

String joined = "chill hit donkey chicken car roast pink rat tree";

into

[chill hit, donkey chicken, car roast, pink rat, tree]

This is my code for that:

  String[] result = joined.split("(?<!\\G\\S+)\\s");
  System.out.printf("%s%n", Arrays.toString(result));

Now, how do I modify the regex so that it splits into groups of 3 or more words?

Expected output (3 words per element):

[chill hit donkey, chicken car roast, pink rat tree]

Expected output (4 words per element):

[chill hit donkey chicken, car roast pink rat, tree]

I tried modifying the regex, but nothing has worked so far. Thanks.

Upvotes: 0

Views: 91

Answers (4)

totoro

Reputation: 2456

Here is another find() version; just change {3} to the group size you want.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Raw pattern: ((?:\w+\W?){3})(?:(\W+|$))
String text = "chill hit donkey chicken car roast pink rat tree";
String regex = "((?:\\w+\\W?){3})(?:(\\W+|$))";
Matcher m = Pattern.compile(regex).matcher(text);
// Group 1 holds the three words without the trailing separator.
while (m.find()) {
    System.out.printf("'%s'%n", m.group(1));
}

Output:

'chill hit donkey'
'chicken car roast'
'pink rat tree'

Upvotes: 1

rock321987

Reputation: 11032

You can use this regex with Matcher.find():

((?:\w+\s){2}(?:\w+)) (replace 2 with 3 for groups of 4 words)

Java Code

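import java.util.regex.Matcher;
import java.util.regex.Pattern;

String line = "chill hit donkey chicken car roast pink rat tree";
// Two "word + whitespace" pairs followed by a final word: 3 words per match.
String pattern = "((?:\\w+\\s){2}(?:\\w+))";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(line);

// Each successful find() yields the next group of words.
while (m.find()) {
    System.out.println(m.group(1));
}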

Upvotes: 1

MicroSystem

Reputation: 23

To split the text into groups of N words, use

((?:\w+\s){N-1}(?:\w+))

where N-1 stands for the literal number. For groups of 2 words use ((?:\w+\s){1}(?:\w+)), for groups of 3 words use ((?:\w+\s){2}(?:\w+)), and so on.
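A pattern with a fixed repetition count only matches complete groups, so leftover words at the end (such as "tree" when grouping by 4) are not returned. The following is a minimal sketch of a variant that keeps them, assuming words are runs of non-whitespace (\S+ rather than \w+). The class name WordGroups and the helper groupWords are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordGroups {
    // Hypothetical helper: collects whitespace-separated words into groups of n.
    public static List<String> groupWords(String text, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        // One word, then up to n-1 further "whitespace + word" pairs.
        // The greedy {0,n-1} takes a full group when it can and a shorter
        // one at the end of the input, so leftover words are kept.
        Pattern p = Pattern.compile("\\S+(?:\\s+\\S+){0," + (n - 1) + "}");
        Matcher m = p.matcher(text);
        List<String> groups = new ArrayList<>();
        while (m.find()) {
            groups.add(m.group());
        }
        return groups;
    }

    public static void main(String[] args) {
        String text = "chill hit donkey chicken car roast pink rat tree";
        System.out.println(groupWords(text, 3));
        System.out.println(groupWords(text, 4));
    }
}
```

With the sample string, grouping by 3 gives three full groups, and grouping by 4 gives two full groups followed by "tree" on its own.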

Upvotes: 1

Daniel Widdis

Reputation: 9091

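Add one more negative lookbehind for each additional word, so that the whitespace after the 1st, 2nd, ... (N-1)th word since the last split is not a split point. For 3 words:

joined.split("(?<!\\G\\S{1,100})(?<!\\G\\S{1,100}\\s\\S{1,100})\\s");

For 4 words, add a third lookbehind that covers three words, and so on:

joined.split("(?<!\\G\\S{1,100})(?<!\\G\\S{1,100}\\s\\S{1,100})(?<!\\G\\S{1,100}\\s\\S{1,100}\\s\\S{1,100})\\s");

A single lookbehind such as (?<!\\G\\S+\\s+\\S+) is not enough: it does not reject the whitespace after the first word, so the string is split at every space. The {1,100} bounds (an assumed cap on word length) keep the lookbehind length finite, since support for unbounded quantifiers inside a lookbehind varies between Java versions.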
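For an arbitrary group size, the split regex can be built in a loop. This is a rough sketch, not a definitive implementation: it emits one negative lookbehind per word count from 1 to n-1, and it assumes single whitespace separators and words of at most 100 characters so that every lookbehind has a bounded length. The class name SplitEveryN and the helper splitEvery are hypothetical.

```java
import java.util.Arrays;

public class SplitEveryN {
    // Hypothetical helper: splits text at every n-th whitespace character.
    public static String[] splitEvery(String text, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be at least 1");
        }
        // Assumption: no word is longer than 100 characters. The bound keeps
        // every lookbehind finite in length.
        String word = "\\S{1,100}";
        StringBuilder regex = new StringBuilder();
        StringBuilder words = new StringBuilder(word);
        // One negative lookbehind per word count 1..n-1: a whitespace that
        // follows fewer than n words since the previous split (\G) is skipped.
        for (int k = 1; k < n; k++) {
            regex.append("(?<!\\G").append(words).append(")");
            words.append("\\s").append(word);
        }
        regex.append("\\s");
        return text.split(regex.toString());
    }

    public static void main(String[] args) {
        String joined = "chill hit donkey chicken car roast pink rat tree";
        System.out.println(Arrays.toString(splitEvery(joined, 3)));
        System.out.println(Arrays.toString(splitEvery(joined, 4)));
    }
}
```

Because this is a split rather than a find, leftover words at the end of the input come back as a final, shorter element.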

Upvotes: 0
