user3422290

Reputation: 283

Java string split with regular expressions

I am far from mastering regular expressions, but I would like to use one to split a string on its first and last underscore, e.g. split

"hello_5_9_2018_world" 

to

"hello"
"5_9_2018"
"world"

I can split it on the last underscore with

String[] splitArray = subjectString.split("_(?=[^_]*$)");

but I cannot figure out how to split on the first underscore as well.

Could anyone show me how I can do this?

Thanks, David

Upvotes: 0

Views: 57

Answers (5)

Dmitrii Cheremisin

Reputation: 1568

I see that several people have already provided solutions, but here is another regex pattern for your question.

You can achieve your goal with this pattern (note that it assumes the first and last parts contain only ASCII letters):

"([a-zA-Z]+)_(.*)_([a-zA-Z]+)"

The whole code looks like this:

    String subjectString= "hello_5_9_2018_world";
    Pattern pattern = Pattern.compile("([a-zA-Z]+)_(.*)_([a-zA-Z]+)");
    Matcher matcher = pattern.matcher(subjectString);
    if(matcher.matches()){
        System.out.println(matcher.group(1));
        System.out.println(matcher.group(2));
        System.out.println(matcher.group(3));
    }

It outputs:

hello

5_9_2018

world

Upvotes: 1

Stefan Winkler

Reputation: 3956

While the other answers are actually nicer and better, if you really want to use split, this is the way to go:

"hello_5_9_2018_world".split("((?<=^[^_]*)_)|(_(?=[^_]*$))")

==> String[3] { "hello", "5_9_2018", "world" }

This is a combination of your lookahead pattern (_(?=[^_]*$))
and the symmetrical lookbehind pattern ((?<=^[^_]*)_),
which matches an _ preceded by ^ (the start of the string) and [^_]* (zero or more non-underscore characters).
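
Java's regex engine has traditionally required a lookbehind to have a bounded length, so the unbounded [^_]* inside the lookbehind may be rejected on some Java versions. A minimal sketch of the same idea with a bounded quantifier follows; the class name, the method name, and the limit of 100 characters for the first part are all assumptions made for illustration, not part of the answer above.

```java
import java.util.Arrays;

class BoundedLookbehindSplit {
    // Assumption: the text before the first underscore is at most 100 chars long.
    // First alternative: an _ preceded only by non-underscore chars since the start.
    // Second alternative: an _ followed only by non-underscore chars until the end.
    private static final String FIRST_OR_LAST_UNDERSCORE =
            "(?<=^[^_]{0,100})_|_(?=[^_]*$)";

    static String[] splitFirstLast(String input) {
        return input.split(FIRST_OR_LAST_UNDERSCORE);
    }

    public static void main(String[] args) {
        String[] parts = splitFirstLast("hello_5_9_2018_world");
        System.out.println(Arrays.toString(parts)); // [hello, 5_9_2018, world]
    }
}
```

If the first part can be longer than the chosen bound, the first underscore will not be matched, so the bound should be picked generously for the data at hand.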

Upvotes: 0

Oleg Cherednik

Reputation: 18245

Regular Expression

(?<first>[^_]+)_(?<middle>.+)_(?<last>[^_]+)

Java Code

final String str = "hello_5_9_2018_world";
// Named groups: text before the first _, text in between, text after the last _
Pattern pattern = Pattern.compile("(?<first>[^_]+)_(?<middle>.+)_(?<last>[^_]+)");
Matcher matcher = pattern.matcher(str);

if (matcher.matches()) {
    String first = matcher.group("first");    // "hello"
    String middle = matcher.group("middle");  // "5_9_2018"
    String last = matcher.group("last");      // "world"
}

Upvotes: 1

Thiyagu

Reputation: 17900

You can achieve this without regex by finding the first and last index of _ and taking substrings based on them.

String s = "hello_5_9_2018_world";

int firstIndex = s.indexOf("_");
int lastIndex = s.lastIndexOf("_");

System.out.println(s.substring(0, firstIndex));
System.out.println(s.substring(firstIndex + 1, lastIndex));
System.out.println(s.substring(lastIndex + 1));

The above prints

hello
5_9_2018
world

Note:

If the string does not contain at least two underscores, you will get a StringIndexOutOfBoundsException.

To safeguard against it, you can check whether the extracted indices are valid.

  • If firstIndex and lastIndex are both -1, the string does not have any underscores.

  • If firstIndex and lastIndex are equal but not -1, the string has just one underscore.
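
The checks above could be folded into a small helper, roughly as sketched below. The class name, the method name, and the choice to return an empty Optional when there are fewer than two underscores are assumptions for illustration; throwing an exception or returning the whole string would be equally reasonable.

```java
import java.util.Arrays;
import java.util.Optional;

class UnderscoreSplitter {
    // Returns the three parts, or Optional.empty() if the string has fewer than two underscores.
    static Optional<String[]> splitOnFirstAndLast(String s) {
        int firstIndex = s.indexOf('_');
        int lastIndex = s.lastIndexOf('_');

        // -1 means no underscore at all; equal indices mean exactly one underscore.
        if (firstIndex == -1 || firstIndex == lastIndex) {
            return Optional.empty();
        }
        return Optional.of(new String[] {
            s.substring(0, firstIndex),
            s.substring(firstIndex + 1, lastIndex),
            s.substring(lastIndex + 1)
        });
    }

    public static void main(String[] args) {
        splitOnFirstAndLast("hello_5_9_2018_world")
                .ifPresent(parts -> System.out.println(Arrays.toString(parts)));
        System.out.println(splitOnFirstAndLast("hello_world").isPresent()); // false
    }
}
```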

Upvotes: 6

Stefan Winkler

Reputation: 3956

If you always have three parts as above, you can use

([^_]*)_(.*)_([^_]*)

and get the individual elements as groups.
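
For completeness, here is a rough sketch of reading the groups, assuming the last group is meant to be ([^_]*), i.e. any run of non-underscore characters. The class and method names are made up for this example, and returning null on a non-match is just one possible choice.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ThreeGroupMatch {
    // Group 1: up to the first _, group 2: everything in between, group 3: after the last _.
    private static final Pattern THREE_PARTS = Pattern.compile("([^_]*)_(.*)_([^_]*)");

    // Returns the three groups, or null if the input has fewer than two underscores.
    static String[] parts(String input) {
        Matcher matcher = THREE_PARTS.matcher(input);
        if (!matcher.matches()) {
            return null;
        }
        return new String[] { matcher.group(1), matcher.group(2), matcher.group(3) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parts("hello_5_9_2018_world")));
        // [hello, 5_9_2018, world]
    }
}
```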

Upvotes: 1
