justDrink

Reputation: 43

How to split a string without losing any word in Java?

I am using Eclipse for Java.

I want to split an input line without losing any characters.

For example, the input line is:

MAC 4 USD7MAIR 2014 USD1111IMAC 123 USD232MPRO 2-0-1-5

And the output should be:

MAC 4 USD7,MAIR 2014 USD1111,IMAC 123 USD232,MPRO 2-0-1-5

(If I split on "M" or a similar delimiter, the M itself is removed.)

What should I do?

Upvotes: 3

Views: 186

Answers (1)

Avinash Raj

Reputation: 174696

You need to use a positive lookahead. It matches a position rather than a character, so nothing is consumed and the M is kept.

string.split("(?=M)");

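Or, to skip the position at the very start of the string (which gives an empty first element on Java 7 and earlier):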

string.split("(?<!^)(?=M)");

Example:

String totalString = "MAC 4 USD7MAIR 2014 USD1111IMAC 123 USD232MPRO 2-0-1-5";
String[] parts = totalString.split("(?=M)");
System.out.println(Arrays.toString(parts));

Output:

[MAC 4 USD7, MAIR 2014 USD1111I, MAC 123 USD232, MPRO 2-0-1-5]
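To make the difference from a plain split concrete, here is a minimal sketch that runs both on the input from the question. The class name LookaheadSplitDemo and the helper splitBeforeM are made up for illustration; only String.split and Arrays.toString come from the JDK.

```java
import java.util.Arrays;

// Illustrative class name, not part of any library.
class LookaheadSplitDemo {

    // Splits at every position that is followed by 'M'.
    // The lookahead is zero-width, so the 'M' stays in the next token.
    // (?<!^) skips the position at index 0, so no empty first element
    // appears on older Java versions.
    static String[] splitBeforeM(String input) {
        return input.split("(?<!^)(?=M)");
    }

    public static void main(String[] args) {
        String line = "MAC 4 USD7MAIR 2014 USD1111IMAC 123 USD232MPRO 2-0-1-5";

        // Plain split: the delimiter "M" is consumed and lost.
        System.out.println(Arrays.toString(line.split("M")));

        // Lookahead split: every 'M' is preserved.
        System.out.println(Arrays.toString(splitBeforeM(line)));
    }
}
```

Note that the lookahead split still cuts inside IMAC, because it only looks for the letter M and knows nothing about where a record ends.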

Update:

The regex below splits the input at the boundary immediately after USD\d+, where \d+ means one or more digits.

String totalString = "MAC 4 USD7MAIR 2014 USD1111IMAC 123 USD232MPRO 2-0-1-5";
String[] parts = totalString.split("(?<=\\bUSD\\d{1,99}+)");
System.out.println(Arrays.toString(parts));

Output:

[MAC 4 USD7, MAIR 2014 USD1111, IMAC 123 USD232, MPRO 2-0-1-5]

(?<=...) is called a positive lookbehind assertion. In languages that support variable-length lookbehind (such as C#), you could use (?<=\bUSD\d+). Unfortunately, Java requires the lookbehind to have a bounded maximum length, so the digit count is bounded instead: \d{1,99} allows from 1 to 99 digits after USD. The + after the } makes the quantifier possessive. A possessive quantifier does not let the regex engine backtrack, so it always takes the longest run of digits, and the lookbehind matches only after the last digit rather than after every digit.
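As a sketch of the point about possessive matching, the example below runs the regex from the answer next to an alternative that gets the same effect with a negative lookahead, (?!\d), instead of the possessive +. The alternative is my own suggestion, not part of the original answer, and the class and method names are made up for illustration.

```java
import java.util.Arrays;

// Illustrative class name, not part of any library.
class LookbehindSplitDemo {

    // Regex from the answer: bounded, possessive lookbehind.
    // Splits only after the complete run of digits that follows USD.
    static String[] splitPossessive(String input) {
        return input.split("(?<=\\bUSD\\d{1,99}+)");
    }

    // Assumed alternative: a plain bounded lookbehind plus a negative
    // lookahead that rejects positions still followed by a digit.
    static String[] splitWithLookahead(String input) {
        return input.split("(?<=\\bUSD\\d{1,99})(?!\\d)");
    }

    public static void main(String[] args) {
        String line = "MAC 4 USD7MAIR 2014 USD1111IMAC 123 USD232MPRO 2-0-1-5";
        System.out.println(Arrays.toString(splitPossessive(line)));
        System.out.println(Arrays.toString(splitWithLookahead(line)));
    }
}
```

Both variants split after USD7, USD1111 and USD232, so IMAC stays in one piece. The lookahead version may be easier to read if possessive quantifiers are unfamiliar.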

Upvotes: 5
