Reputation: 43
I am using Eclipse for Java and I want to split an input line without losing any characters.
For example, the input line is:
IPOD6 1 USD6IPHONE6 16G,64G,128G USD9,USD99,USD999MACAIR 2013-2014 USD123MACPRO 2013-2014,2014-2015 USD899,USD999
and the desired output is:
IPOD6 1 USD6
IPHONE6 16G,64G,128G USD9,USD99,USD999
MACAIR 2013-2014 USD123
MACPRO 2013-2014,2014-2015 USD899,USD999
I tried split("(?<=\\bUSD\\d{1,99}+)")
but it does not produce the output above.
Upvotes: 1
Views: 93
Reputation: 174696
You just need to add a non-word boundary, \B, at the end of the positive look-behind.
\B matches between two word characters or between two non-word characters.
The pattern won't split between USD9 and the comma in the substring USD9, because a word boundary exists there: 9 is a word character and , is a non-word character.
It does split between USD6 and IPHONE6 because a non-word boundary \B exists there: 6 is a word character and I is also a word character.
String s = "IPOD6 1 USD6IPHONE6 16G,64G,128G USD9,USD99,USD999MACAIR 2013-2014 USD123MACPRO 2013-2014,2014-2015 USD899,USD999";
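// Split after each USD amount, but only where the next character is also a word character.
String[] parts = s.split("(?<=\\bUSD\\d{1,99}+\\B)");
for (String part : parts) {
    System.out.println(part);
}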
Output:
IPOD6 1 USD6
IPHONE6 16G,64G,128G USD9,USD99,USD999
MACAIR 2013-2014 USD123
MACPRO 2013-2014,2014-2015 USD899,USD999
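To check the boundary reasoning for yourself, here is a minimal sketch that asks whether \B matches at one specific index. The class name BoundaryCheck and the helper isNonWordBoundary are made up for illustration; they are not part of the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryCheck {
    // Hypothetical helper: true if \B matches exactly at index pos of s.
    static boolean isNonWordBoundary(String s, int pos) {
        Matcher m = Pattern.compile("\\B").matcher(s);
        // find(int) resets the matcher and searches from pos onward,
        // so only a match that starts at pos itself is accepted.
        return m.find(pos) && m.start() == pos;
    }

    public static void main(String[] args) {
        // Index 4 is the gap right after "USD6" or "USD9".
        System.out.println(isNonWordBoundary("USD6IPHONE6", 4)); // true: 6 and I are both word characters
        System.out.println(isNonWordBoundary("USD9,USD99", 4));  // false: 9 is a word character, the comma is not
    }
}
```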
Upvotes: 1
Reputation: 7948
Without making it too complicated, use this pattern:
(?=IPOD|IPHONE|MAC)
and replace each match with a newline.
After that it is easy to capture the lines or split them into an array.
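In Java the look-ahead can be passed straight to split, which skips the intermediate newline step. This is only a sketch and assumes every record starts with IPOD, IPHONE or MAC, as in the sample line; the class and method names are invented for the example. Empty pieces are dropped because older Java versions return an empty first element when the pattern matches at index 0.

```java
import java.util.ArrayList;
import java.util.List;

class SplitBeforeProductName {
    // Hypothetical helper: cut the line in front of every known product prefix.
    static List<String> splitRecords(String line) {
        List<String> records = new ArrayList<>();
        for (String part : line.split("(?=IPOD|IPHONE|MAC)")) {
            if (!part.isEmpty()) { // guard against a leading empty element
                records.add(part);
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String s = "IPOD6 1 USD6IPHONE6 16G,64G,128G USD9,USD99,USD999"
                 + "MACAIR 2013-2014 USD123MACPRO 2013-2014,2014-2015 USD899,USD999";
        for (String record : splitRecords(s)) {
            System.out.println(record);
        }
    }
}
```

The trade-off is that the list of prefixes has to be kept up to date by hand; a new product name would not be split off.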
Or maybe this pattern:
((USD\d+,?)+)
and replace each match with $1\n
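A rough Java version of this replacement could look like the sketch below. It assumes a record always ends with a comma-separated run of USD prices, as in the sample line; the class and method names are hypothetical. Note that the last record also gets a trailing newline, which String.split("\n") discards because trailing empty strings are removed.

```java
class NewlineAfterPrices {
    // Hypothetical helper: append a newline after every run of USD prices.
    static String breakAfterPrices(String line) {
        // Group 1 captures the whole price run, e.g. "USD9,USD99,USD999".
        // In the replacement, $1 is the captured run and \n is a real newline character.
        return line.replaceAll("((USD\\d+,?)+)", "$1\n");
    }

    public static void main(String[] args) {
        String s = "IPOD6 1 USD6IPHONE6 16G,64G,128G USD9,USD99,USD999"
                 + "MACAIR 2013-2014 USD123MACPRO 2013-2014,2014-2015 USD899,USD999";
        String[] records = breakAfterPrices(s).split("\n");
        for (String record : records) {
            System.out.println(record);
        }
    }
}
```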
Upvotes: 1