Prakhar

Reputation: 536

Split a string using regex into 3 parts

I have a string like this

DATA/2019-00-01-23.x

I want to get three tokens: text, date and hour

[DATA, 2019-00-01, 23]

I tried this

String x = "DATA/2019-00-01-23.x";
System.out.println(Arrays.toString(x.split("/|-[0-9]+.")));

This returns

[DATA, 2019, 01, x]

Upvotes: 1

Views: 819

Answers (3)

Youcef LAIDANI

Reputation: 59988

Solution 1

You can strip everything from the first dot onward, then split on the slash or on the last hyphen with /|(\-)(?!.*\-) :

String[] split = "DATA/2019-00-01-23.x".replaceFirst("\\..*$", "")
    .split("/|(\\-)(?!.*\\-)"); // [DATA, 2019-00-01, 23]

Solution 2

I would go with Pattern and Matcher and capturing groups, using the regex (.*?)/(.*?)-([^-]+)\..* :

Pattern pattern = Pattern.compile("(.*?)/(.*?)-([^-]+)\\..*");
Matcher matcher = pattern.matcher("DATA/2019-00-01-23.x");
if(matcher.find()){
    System.out.println(matcher.group(1)); // DATA
    System.out.println(matcher.group(2)); // 2019-00-01
    System.out.println(matcher.group(3)); // 23
}

Or, with Java 9+, you can use Matcher.results() :

String[] result = Pattern.compile("(.*?)/(.*?)-([^-]+)\\..*")
        .matcher("DATA/2019-00-01-23.x")
        .results()
        .flatMap(grps -> Stream.of(grps.group(1), grps.group(2), grps.group(3)))
        .toArray(String[]::new);

Outputs

[DATA, 2019-00-01, 23]

Upvotes: 0

Wiktor Stribiżew

Reputation: 626870

You may actually use a split like

x.split("/|-(?=[^-]*$)|\\D+$")

See the Java demo, output: [DATA, 2019-00-01, 23].

This regex will split at

  • / - a slash
  • | - or
  • -(?=[^-]*$) - last hyphen in the string
  • | - or
  • \D+$ - one or more non-digit chars at the end of the string (String.split(regex) runs with a limit argument of 0, so the empty string left after this final match is dropped and the resulting array has no trailing empty item)
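
To see the effect of the limit argument, here is a minimal sketch that runs the same regex with the default limit (0) and with a negative limit, which keeps trailing empty strings. The class and method names (SplitLimitDemo, splitDefault, splitKeepTrailing) are made up for illustration.

```java
import java.util.Arrays;

public class SplitLimitDemo {
    // Regex from the answer: a slash, the last hyphen, or trailing non-digits.
    static final String REGEX = "/|-(?=[^-]*$)|\\D+$";

    // Default split: limit 0, so trailing empty strings are discarded.
    static String[] splitDefault(String input) {
        return input.split(REGEX);
    }

    // Negative limit: trailing empty strings are kept.
    static String[] splitKeepTrailing(String input) {
        return input.split(REGEX, -1);
    }

    public static void main(String[] args) {
        String x = "DATA/2019-00-01-23.x";
        System.out.println(Arrays.toString(splitDefault(x)));      // [DATA, 2019-00-01, 23]
        System.out.println(Arrays.toString(splitKeepTrailing(x))); // [DATA, 2019-00-01, 23, ]
    }
}
```

The fourth, empty element in the second output is what remains after ".x" is consumed as a delimiter; the default limit drops it.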

Upvotes: 1

Mark Jeronimus

Reputation: 9543

Use capturing groups to extract the three parts.

private static final Pattern PATTERN = Pattern.compile("(.+)/([-0-9]+)-([0-9]{1,2})\\..*");

public static void main(String... args) {
    Matcher matcher = PATTERN.matcher("DATA/2019-00-01-23.x");

    if (matcher.matches() && matcher.groupCount() == 3) {
        String text = matcher.group(1);
        String date = matcher.group(2);
        String hour = matcher.group(3);
        System.out.println(text + "\t" + date + '\t' + hour);
    }
}

Dissected: (.+) / ([-0-9]+) - ([0-9]{1,2}) \..*

  • (.+) Everything before the /
  • ([-0-9]+) Digits, can contain -
  • - to prevent the previous part from gobbling up the hour
  • ([0-9]{1,2}) One or two digits
  • \..* A period, then 'the rest'.
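
As a quick check of the dissection, the sketch below runs the same pattern against a two-digit hour, a one-digit hour, and a three-digit hour. The class name HourTokenDemo, the parse helper, and the extra sample inputs are assumptions for illustration, not part of the question.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HourTokenDemo {
    // Same pattern as in the answer above.
    private static final Pattern PATTERN =
            Pattern.compile("(.+)/([-0-9]+)-([0-9]{1,2})\\..*");

    // Hypothetical helper: returns {text, date, hour}, or empty if the whole input does not match.
    static Optional<String[]> parse(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] {m.group(1), m.group(2), m.group(3)});
    }

    public static void main(String[] args) {
        String[] samples = {
            "DATA/2019-00-01-23.x",  // two-digit hour
            "DATA/2019-00-01-7.x",   // one-digit hour
            "DATA/2019-00-01-123.x"  // three digits: rejected by {1,2}
        };
        for (String s : samples) {
            System.out.println(s + " -> " + parse(s).map(Arrays::toString).orElse("no match"));
        }
    }
}
```

Because matches() requires the whole input to fit the pattern, the three-digit case is rejected instead of being silently truncated.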

Upvotes: 0
