Unnamed user

Reputation: 181

Regex pattern to split values on commas but keep commas inside parentheses

I am trying to modify a regex so that it splits the values on commas but keeps any commas that appear inside parentheses.

Existing pattern: ([^\\s,]+)\\s*=>([^,]+)

Updated pattern: ([^\\s,]+)\\s*=>([^(,)]+)

Java code:


    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Main {
        public static void main(String[] args) {
            String softParms = "batch_code => 'batchCd',user_id => 'SYSUSER',thread_pool => 'tpName',business_date => FN_DATE_ARG(null,0),rerun_number => 0,max_timeout_mins => 0,raise_error => false,thread_notifications => false";

            // Existing pattern: the value stops at the first comma.
            // Pattern paramPattern = Pattern.compile("([^\\s,]+)\\s*=>([^,]+)");
            // Updated pattern: the value stops at the first parenthesis or comma.
            Pattern paramPattern = Pattern.compile("([^\\s,]+)\\s*=>([^(,)]+)");
            Matcher matcher = paramPattern.matcher(softParms);
            while (matcher.find()) {
                String param = matcher.group(1);
                String value = matcher.group(2);
                System.out.println("Param: " + param + ", Value: " + value);
            }
        }
    }

The value for business_date should come out as FN_DATE_ARG(null,0), but the existing pattern returns FN_DATE_ARG(null and the updated pattern returns FN_DATE_ARG.

Would appreciate any help on this!

Upvotes: 2

Views: 70

Answers (2)

Tim Biegeleisen

Reputation: 520968

Why use an overly complex regular expression when all that is needed is a single one-line call to String#replaceAll:

    String softParms = "batch_code => 'batchCd',user_id => 'SYSUSER',thread_pool => 'tpName',business_date => FN_DATE_ARG(null,0),rerun_number => 0,max_timeout_mins => 0,raise_error => false,thread_notifications => false";
    String businessDate = softParms.replaceAll(".*\\bbusiness_date => (.*?)\\s*(?:,[^,\\s]+ =>.*|$)", "$1");
    System.out.println(businessDate);

This prints:

    FN_DATE_ARG(null,0)

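The pattern matches the entire input. The leading .* consumes everything before the key business_date, and the lazy group (.*?) captures its value, stopping at the first comma that is followed by another key and => (the ,[^,\\s]+ => part) or at the end of the string. Because the whole input is matched, replacing it with $1 leaves only the captured value, FN_DATE_ARG(null,0).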
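If the value of an arbitrary key is needed, the same idea might be wrapped in a small helper. This is only a sketch: the class name SingleParamLookup and the method extractValue are hypothetical, and it swaps replaceAll for Matcher.find with a lookahead, so it is a variation on the pattern above rather than the exact code from this answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleParamLookup {

    // Hypothetical helper (an assumption, not part of the answer above):
    // returns the value stored under one key, or null when the key is absent.
    static String extractValue(String params, String key) {
        // Capture lazily until a comma that is followed by "<nextKey> =>",
        // or until the end of the input. Pattern.quote keeps any regex
        // metacharacters in the key literal.
        Pattern p = Pattern.compile(
                "\\b" + Pattern.quote(key) + "\\s*=>\\s*(.*?)\\s*(?=,[^,\\s]+\\s*=>|$)");
        Matcher m = p.matcher(params);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String softParms = "batch_code => 'batchCd',user_id => 'SYSUSER',thread_pool => 'tpName',business_date => FN_DATE_ARG(null,0),rerun_number => 0,max_timeout_mins => 0,raise_error => false,thread_notifications => false";
        System.out.println(extractValue(softParms, "business_date")); // FN_DATE_ARG(null,0)
        System.out.println(extractValue(softParms, "missing_key"));   // null
    }
}
```

One reason to prefer find here: when the key is missing, replaceAll finds nothing to replace and returns the whole input unchanged, whereas this helper returns null.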

Upvotes: 2

Wiktor Stribiżew

Reputation: 626738

You can use

    ([^\s,]+)\s*=>\s*(.*?)(?=\s*,\s*\w+\s*=>|$)

See the regex demo. Details:

  • ([^\s,]+) - Group 1: one or more chars other than whitespace and a comma
  • \s*=>\s* - => enclosed with zero or more whitespaces
  • (.*?) - Group 2: any zero or more chars other than line break chars as few as possible
  • (?=\s*,\s*\w+\s*=>|$) - up to the leftmost sequence of 0+ whitespaces, comma, 0+ whitespaces, 1+ word chars, 0+ whitespaces, =>, or end of string.

In your code, use

    Pattern paramPattern = Pattern.compile("([^\\s,]+)\\s*=>\\s*(.*?)(?=\\s*,\\s*\\w+\\s*=>|$)");

See the Java demo online.
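For reference, a minimal self-contained sketch of how this pattern might be dropped into the loop from the question. The class name SoftParamParser, the parse method, and the choice of a LinkedHashMap (to keep the parameters in input order) are assumptions made for illustration, not part of the original code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SoftParamParser {

    // Pattern from this answer: key, "=>", then a lazily matched value that
    // ends right before ", nextKey =>" or at the end of the string.
    private static final Pattern PARAM_PATTERN = Pattern.compile(
            "([^\\s,]+)\\s*=>\\s*(.*?)(?=\\s*,\\s*\\w+\\s*=>|$)");

    // Hypothetical helper: collects every key/value pair in input order.
    static Map<String, String> parse(String params) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher matcher = PARAM_PATTERN.matcher(params);
        while (matcher.find()) {
            result.put(matcher.group(1), matcher.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String softParms = "batch_code => 'batchCd',user_id => 'SYSUSER',thread_pool => 'tpName',business_date => FN_DATE_ARG(null,0),rerun_number => 0,max_timeout_mins => 0,raise_error => false,thread_notifications => false";
        for (Map.Entry<String, String> entry : parse(softParms).entrySet()) {
            System.out.println("Param: " + entry.getKey() + ", Value: " + entry.getValue());
        }
    }
}
```

Because the value ends only where a comma is followed by a new key and =>, the comma inside FN_DATE_ARG(null,0) stays part of the value. The same rule means a value that itself contains text shaped like ", word =>" would be cut short.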

Upvotes: 2
