Reputation: 536
I have a string like this:
DATA/2019-00-01-23.x
I want to split it into three tokens (text, date and hour):
[DATA, 2019-00-01, 23]
I tried this:
String x = "DATA/2019-00-01-23.x";
System.out.println(Arrays.toString(x.split("/|-[0-9]+.")));
But this returns:
[DATA, 2019, 01, x]
Upvotes: 1
Views: 819
Reputation: 59988
You can first strip everything from the first dot onward, then split on /|(\-)(?!.*\-), which matches a slash or the last hyphen:
String[] split = "DATA/2019-00-01-23.x".replaceFirst("\\..*$", "")
        .split("/|(\\-)(?!.*\\-)"); // [DATA, 2019-00-01, 23]
I would rather go with Pattern and Matcher and capturing groups, using the regex (.*?)/(.*?)-([^-]+)\..* (backslash doubled inside a Java string):
Pattern pattern = Pattern.compile("(.*?)/(.*?)-([^-]+)\\..*");
Matcher matcher = pattern.matcher("DATA/2019-00-01-23.x");
if (matcher.find()) {
    System.out.println(matcher.group(1)); // DATA
    System.out.println(matcher.group(2)); // 2019-00-01
    System.out.println(matcher.group(3)); // 23
}
Or, with Java 9+, you can use Matcher.results():
String[] result = Pattern.compile("(.*?)/(.*?)-([^-]+)\\..*")
        .matcher("DATA/2019-00-01-23.x")
        .results()
        .flatMap(grps -> Stream.of(grps.group(1), grps.group(2), grps.group(3)))
        .toArray(String[]::new);
Output:
[DATA, 2019-00-01, 23]
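For reuse, the Pattern/Matcher variant can be wrapped in a small helper. This is only a sketch: the class name TokenExtractor and the method extract are made up for illustration, and it uses matches() rather than find(), on the assumption that the whole input should follow the TEXT/DATE-HOUR.ext shape.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class TokenExtractor {
    // Same regex as in the snippets above, compiled once.
    private static final Pattern PATTERN = Pattern.compile("(.*?)/(.*?)-([^-]+)\\..*");

    // Returns [text, date, hour], or an empty Optional if the input does not match.
    static Optional<String[]> extract(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group(1), m.group(2), m.group(3) });
    }

    public static void main(String[] args) {
        extract("DATA/2019-00-01-23.x")
                .ifPresent(t -> System.out.println(Arrays.toString(t))); // [DATA, 2019-00-01, 23]
        System.out.println(extract("no separators here").isPresent());   // false
    }
}
```

Returning an Optional makes the non-matching case explicit instead of leaving the caller with an exception from group().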
Upvotes: 0
Reputation: 626870
You may actually use a single split, like
x.split("/|-(?=[^-]*$)|\\D+$")
See the Java demo, output: [DATA, 2019-00-01, 23].
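A minimal sketch to try this split and to see what the limit argument changes; the class name SplitDemo and its method names are placeholders. With a negative limit, the empty string left after the final \D+$ match is kept in the result.

```java
import java.util.Arrays;

// Hypothetical demo class.
class SplitDemo {
    static final String REGEX = "/|-(?=[^-]*$)|\\D+$";

    // Default split (limit 0): trailing empty strings are dropped.
    static String[] tokens(String s) {
        return s.split(REGEX);
    }

    // Negative limit: trailing empty strings are kept.
    static String[] tokensKeepingEmpties(String s) {
        return s.split(REGEX, -1);
    }

    public static void main(String[] args) {
        String x = "DATA/2019-00-01-23.x";
        System.out.println(Arrays.toString(tokens(x)));               // [DATA, 2019-00-01, 23]
        System.out.println(Arrays.toString(tokensKeepingEmpties(x))); // [DATA, 2019-00-01, 23, ]
    }
}
```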
This regex will split at
/ - a slash
| - or
-(?=[^-]*$) - the last hyphen in the string
| - or
\D+$ - any 1+ non-digit chars at the end of the string (as String.split(regex) is run with the limit argument set to 0, these matches at the end of the string do not result in trailing empty items in the resulting array).
Upvotes: 1
Reputation: 9543
Use capturing groups to extract the three parts.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main { // class name is arbitrary
    private static final Pattern PATTERN = Pattern.compile("(.+)/([-0-9]+)-([0-9]{1,2})\\..*");

    public static void main(String... args) {
        Matcher matcher = PATTERN.matcher("DATA/2019-00-01-23.x");
        if (matcher.matches() && matcher.groupCount() == 3) {
            String text = matcher.group(1);
            String date = matcher.group(2);
            String hour = matcher.group(3);
            System.out.println(text + "\t" + date + '\t' + hour);
        }
    }
}
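The {1,2} quantifier on the last group accepts a one-digit hour as well as a two-digit one. A small sketch to compare it with a stricter {2}; the class name HourQuantifierDemo, the STRICT pattern and the sample input with hour 5 are assumptions added for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class.
class HourQuantifierDemo {
    // Pattern from the code above: the hour is one or two digits.
    static final Pattern LENIENT = Pattern.compile("(.+)/([-0-9]+)-([0-9]{1,2})\\..*");
    // Assumed stricter variant: the hour must be exactly two digits.
    static final Pattern STRICT = Pattern.compile("(.+)/([-0-9]+)-([0-9]{2})\\..*");

    // Returns the hour group, or null when the whole input does not match.
    static String hour(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.matches() ? m.group(3) : null;
    }

    public static void main(String[] args) {
        System.out.println(hour(LENIENT, "DATA/2019-00-01-23.x")); // 23
        System.out.println(hour(STRICT, "DATA/2019-00-01-23.x"));  // 23
        System.out.println(hour(LENIENT, "DATA/2019-00-01-5.x"));  // 5
        System.out.println(hour(STRICT, "DATA/2019-00-01-5.x"));   // null
    }
}
```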
Dissected: (.+) / ([-0-9]+) - ([0-9]{1,2}) \..*
(.+) - everything before the /
([-0-9]+) - numbers, can contain -
- - a literal hyphen, to prevent the previous part from gobbling up the hour
([0-9]{1,2}) - one or two digits
\..* - a period, then 'the rest'.
Upvotes: 0