justDrink

Reputation: 43

How to split a string without losing any word?

I am using Eclipse for Java and I want to split an input line without losing any characters.

For example, the input line is:

IPOD6 1 USD6IPHONE6 16G,64G,128G USD9,USD99,USD999MACAIR 2013-2014 USD123MACPRO 2013-2014,2014-2015 USD899,USD999

and the desired output is:

IPOD6 1 USD6
IPHONE6 16G,64G,128G USD9,USD99,USD999
MACAIR 2013-2014 USD123
MACPRO 2013-2014,2014-2015 USD899,USD999

I was using split("(?<=\\bUSD\\d{1,99}+)") but it doesn't work.

Upvotes: 1

Views: 93

Answers (2)

Avinash Raj

Reputation: 174696

You just need to add a non-word boundary \B inside the positive lookbehind. \B matches between two word characters or between two non-word characters. The regex won't split between USD9 and the comma in the substring USD9, because a word boundary exists there: 9 is a word character and , is a non-word character. It does split between USD6 and IPHONE6 because a non-word boundary \B exists there: 6 and I are both word characters.

String s = "IPOD6 1 USD6IPHONE6 16G,64G,128G USD9,USD99,USD999MACAIR 2013-2014 USD123MACPRO 2013-2014,2014-2015 USD899,USD999";
String[] parts = s.split("(?<=\\bUSD\\d{1,99}+\\B)");
for(String i: parts)
{
    System.out.println(i);
}

Output:

IPOD6 1 USD6
IPHONE6 16G,64G,128G USD9,USD99,USD999
MACAIR 2013-2014 USD123
MACPRO 2013-2014,2014-2015 USD899,USD999
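
The boundary behavior described above can be checked in isolation. The following is only a small sketch; the class name BoundaryCheck and the helper found are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Pattern;

public class BoundaryCheck {
    // Hypothetical helper: true if the regex matches anywhere in the text.
    public static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // '6' and 'I' are both word characters, so \B matches between them.
        System.out.println(found("6\\BI", "USD6IPHONE6")); // true
        // '9' is a word character and ',' is not, so \B fails there...
        System.out.println(found("9\\B,", "USD9,"));       // false
        // ...while \b (a word boundary) matches at the same position.
        System.out.println(found("9\\b,", "USD9,"));       // true
    }
}
```

This is why the split happens between USD6 and IPHONE6 but not between a price and the comma that follows it.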

Upvotes: 1

alpha bravo

Reputation: 7948

Without making it too complicated, use this pattern:

(?=IPOD|IPHONE|MAC)

and replace each match with a newline. Now it is easy to capture the records or split them into an array.
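
In Java the same lookahead can be passed straight to split, which skips the replace step. This is a minimal sketch, assuming Java 8 or later (where a zero-width match at the start of the input does not produce an empty leading element); the class name LookaheadSplit and the method splitRecords are hypothetical.

```java
public class LookaheadSplit {
    // Splits immediately before each product name.
    // The three names are taken from the sample input in the question.
    public static String[] splitRecords(String line) {
        return line.split("(?=IPOD|IPHONE|MAC)");
    }

    public static void main(String[] args) {
        String s = "IPOD6 1 USD6IPHONE6 16G,64G,128G USD9,USD99,USD999"
                 + "MACAIR 2013-2014 USD123MACPRO 2013-2014,2014-2015 USD899,USD999";
        // Prints the four records from the question, one per line.
        for (String part : splitRecords(s)) {
            System.out.println(part);
        }
    }
}
```

Note that this only works when every record starts with one of the listed names, so a new product prefix would have to be added to the pattern.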


Or maybe this pattern:

((USD\d+,?)+)

and replace with $1\n
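
A rough Java sketch of this second pattern, using replaceAll to append a newline after each run of comma-separated prices and then splitting on that newline. The class name PriceListSplit and the method splitRecords are hypothetical, and the sketch assumes the input itself contains no newlines.

```java
public class PriceListSplit {
    // Appends "\n" after each price list (group 1), then splits on it.
    // String.split drops the trailing empty string left by the final newline.
    public static String[] splitRecords(String line) {
        return line.replaceAll("((USD\\d+,?)+)", "$1\n").split("\n");
    }

    public static void main(String[] args) {
        String s = "IPOD6 1 USD6IPHONE6 16G,64G,128G USD9,USD99,USD999"
                 + "MACAIR 2013-2014 USD123MACPRO 2013-2014,2014-2015 USD899,USD999";
        // Prints the four records from the question, one per line.
        for (String part : splitRecords(s)) {
            System.out.println(part);
        }
    }
}
```

Unlike the lookahead on product names, this version keys on the price list that ends each record, so it does not need to know the product names in advance.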

Upvotes: 1
