user2332505

Reputation: 649

Regex to split a string using Java

I am trying to parse a string because I need to pass the result to the UI as a map. Here is my input string:

      "2020-02-01T00:00:00Z",1,
      "2020-04-01T00:00:00Z",4,
      "2020-05-01T00:00:00Z",2,
      "2020-06-01T00:00:00Z",31,
      "2020-07-01T00:00:00Z",60,
      "2020-08-01T00:00:00Z",19,
      "2020-09-01T00:00:00Z",10,
      "2020-10-01T00:00:00Z",33,
      "2020-11-01T00:00:00Z",280,
      "2020-12-01T00:00:00Z",61,
      "2021-01-01T00:00:00Z",122,
      "2021-12-01T00:00:00Z",1

I need to split the string like this:

    "2020-02-01T00:00:00Z",1 : split[0]
    "2020-04-01T00:00:00Z",4 : split[1]

The issue is that I can't simply split on "," because a comma appears twice per record: once between the date and the number, and once after the number.

I need a regex that returns "2020-02-01T00:00:00Z",1 as a single token that I can process further.

I am new to regex. Can someone please provide a regular expression for this?

Upvotes: 0

Views: 88

Answers (3)

Eddie Lopez

Reputation: 1139

Here's your pattern:

import java.util.regex.Matcher;
import java.util.regex.Pattern;
final Pattern pattern = Pattern.compile("(\\S+),(\\d+)");
final Matcher matcher = pattern.matcher("Input....");

Here's how to use it:

while (matcher.find()) {
    final String date = matcher.group(1);
    final String number = matcher.group(2);
}
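Since the question mentions passing a map to the UI, the loop above could be extended roughly as follows. This is only a sketch: the class name `PairMapSketch` and the helper `toMap` are hypothetical, and it assumes the records are separated by whitespace (as in the question's input), because `\S+` is greedy and would otherwise run across several records. Note that `group(1)` still includes the surrounding double quotes, so the sketch strips them.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PairMapSketch {
    // Same pattern as in the answer: a non-whitespace run, a comma, then digits.
    private static final Pattern PAIR = Pattern.compile("(\\S+),(\\d+)");

    // Hypothetical helper: collects each date/number pair into an insertion-ordered map.
    static Map<String, Integer> toMap(String input) {
        Map<String, Integer> result = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(input);
        while (matcher.find()) {
            // group(1) still carries the double quotes around the date, so remove them.
            String date = matcher.group(1).replace("\"", "");
            result.put(date, Integer.parseInt(matcher.group(2)));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "\"2020-02-01T00:00:00Z\",1,\n      \"2020-04-01T00:00:00Z\",4";
        // prints {2020-02-01T00:00:00Z=1, 2020-04-01T00:00:00Z=4}
        System.out.println(toMap(input));
    }
}
```

A `LinkedHashMap` is used here so the entries keep the order in which they appear in the input; a plain `HashMap` would work too if order does not matter.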

Upvotes: 0

Arvind Kumar Avinash

Reputation: 79075

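If you want each date-time and ID pair as one match, you can use the regex (\"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\",\d+)(?=,|$) and collect the match results.

The trailing (?=,|$) is a positive lookahead: it asserts that each match is followed by a comma or by the end of the input, without consuming that comma. Without the MULTILINE flag, $ matches at the end of the input rather than at the end of each line.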

Demo (note that Matcher.results() requires Java 9 or later):

import java.util.List;
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class Main {
    public static void main(String[] args) {
        String s = "\"2020-02-01T00:00:00Z\",1,\n"
                + "      \"2020-04-01T00:00:00Z\",4,\n"
                + "      \"2020-05-01T00:00:00Z\",2,\n"
                + "      \"2020-06-01T00:00:00Z\",31,\n"
                + "      \"2020-07-01T00:00:00Z\",60,\n"
                + "      \"2020-08-01T00:00:00Z\",19,\n"
                + "      \"2020-09-01T00:00:00Z\",10,\n"
                + "      \"2020-10-01T00:00:00Z\",33,\n"
                + "      \"2020-11-01T00:00:00Z\",280,\n"
                + "      \"2020-12-01T00:00:00Z\",61,\n"
                + "      \"2021-01-01T00:00:00Z\",122,\n"
                + "      \"2021-12-01T00:00:00Z\",1";
        
        List<String> list = Pattern.compile("(\\\"\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z\\\",\\d+)(?=,|$)")
                .matcher(s)
                .results()
                .map(MatchResult::group)
                .collect(Collectors.toList());
        
        list.forEach(System.out::println);
    }
}

Output:

"2020-02-01T00:00:00Z",1
"2020-04-01T00:00:00Z",4
"2020-05-01T00:00:00Z",2
"2020-06-01T00:00:00Z",31
"2020-07-01T00:00:00Z",60
"2020-08-01T00:00:00Z",19
"2020-09-01T00:00:00Z",10
"2020-10-01T00:00:00Z",33
"2020-11-01T00:00:00Z",280
"2020-12-01T00:00:00Z",61
"2021-01-01T00:00:00Z",122
"2021-12-01T00:00:00Z",1
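Because the question asks specifically about splitting, the same lookahead idea can also be applied with `String.split`. The following is a sketch rather than part of the original answer; the class name `SplitOnRecordBoundary` and the helper `splitRecords` are hypothetical. It assumes every record starts with a double quote, so a comma is treated as a separator only when the next non-whitespace character is a quote.

```java
import java.util.Arrays;
import java.util.List;

class SplitOnRecordBoundary {
    // Splits on a comma (plus any whitespace after it) only when a double quote
    // follows, i.e. only at the boundary between two records.
    static List<String> splitRecords(String input) {
        return Arrays.asList(input.trim().split(",\\s*(?=\")"));
    }

    public static void main(String[] args) {
        String s = "\"2020-02-01T00:00:00Z\",1,\n"
                + "      \"2020-04-01T00:00:00Z\",4,\n"
                + "      \"2020-05-01T00:00:00Z\",2";
        // Prints one record per line, e.g. "2020-02-01T00:00:00Z",1
        splitRecords(s).forEach(System.out::println);
    }
}
```

This variant uses only `String.split`, so it does not depend on `Matcher.results()`. If the input ends with a trailing comma, that comma stays attached to the last token, because no quote follows it.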

Upvotes: 4

Mark C

Reputation: 129

Why can't you just split on "," and ignore the last value?
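One way to read this suggestion, sketched under the assumption that the tokens strictly alternate between a quoted date and a number: split on every comma, then walk the tokens two at a time. The class name `PlainSplitSketch` and the helper `pairUp` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PlainSplitSketch {
    // Hypothetical helper: splits on every comma and pairs up consecutive tokens.
    static Map<String, Integer> pairUp(String input) {
        // Whitespace (including line breaks) around each comma is dropped by the split.
        String[] tokens = input.trim().split("\\s*,\\s*");
        Map<String, Integer> result = new LinkedHashMap<>();
        // Tokens alternate: date, number, date, number, ...
        // An unpaired token at the end, if any, is ignored.
        for (int i = 0; i + 1 < tokens.length; i += 2) {
            String date = tokens[i].replace("\"", "");
            result.put(date, Integer.parseInt(tokens[i + 1]));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "\"2020-02-01T00:00:00Z\",1,\n      \"2020-04-01T00:00:00Z\",4,";
        // prints {2020-02-01T00:00:00Z=1, 2020-04-01T00:00:00Z=4}
        System.out.println(pairUp(input));
    }
}
```

`String.split` discards trailing empty strings, so a trailing comma in the input does not produce an extra token.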

Upvotes: 0
