emilly

Reputation: 10530

Regex with group captures?

My input string can take any of the forms below:

   "cust_100dept_200_address_300";
   "cust_100_dept_200_address_300";
   "dept_200_address_300cust_100";
   "address_300cust_100dept_200";

Basically, there are three attributes (cust, dept and address), each followed by an underscore and some digits. Their order is flexible, as shown in my examples, where cust_100 can come at the beginning, in the middle, or at the end.

I want the digits after the underscore for each attribute. So my expected output, whatever the order of the input attributes, is:

   group1 = 100
   group2 = 200
   group3 = 300

I tried the following:

    Pattern p = Pattern.compile(
            "cust_(\\d+)" +
            "dept_(\\d+)" + "address_(\\d+)");
    Matcher m = p.matcher(input); // where input can be any of the strings listed above
    if (m.find()) {
        System.out.println("inside if");
        System.out.println("group1 = " + m.group(1));
        System.out.println("group2 = " + m.group(2));
        System.out.println("group3 = " + m.group(3));
    }


But I am not getting the desired output.

Upvotes: 0

Views: 228

Answers (5)

Evgeniy Dorofeev

Reputation: 136002

I would do it differently

    String g1 = s.replaceAll(".*cust_(\\d+).*", "$1");
    String g2 = s.replaceAll(".*dept_(\\d+).*", "$1");
    String g3 = s.replaceAll(".*address_(\\d+).*", "$1");
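
One thing to keep in mind with this approach: `replaceAll` returns the string unchanged when the pattern does not match, so a missing attribute would hand back the whole input. A minimal sketch that guards against this, assuming `null` is an acceptable "not found" value (the class and method names here are made up for illustration):

```java
class ReplaceAllDemo {
    // Returns the digits following "<name>_", or null when the attribute is absent.
    // Assumes name is a plain word with no regex metacharacters.
    static String extract(String s, String name) {
        String regex = ".*" + name + "_(\\d+).*";
        // replaceAll leaves the input untouched when nothing matches,
        // so check for a full match before substituting.
        return s.matches(regex) ? s.replaceAll(regex, "$1") : null;
    }

    public static void main(String[] args) {
        String s = "dept_200_address_300cust_100";
        System.out.println("group1 = " + extract(s, "cust"));
        System.out.println("group2 = " + extract(s, "dept"));
        System.out.println("group3 = " + extract(s, "address"));
    }
}
```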

Upvotes: 2

anubhava

Reputation: 785098

Rather than building 3 patterns and 3 matchers, I believe you will be better off with just 1 generic pattern. Consider the following code:

String str = "cust_100_dept_200_address_300"; // your input
Pattern p = Pattern.compile("(?i)(cust|dept|address)_(\\d+)");
Matcher m = p.matcher(str);
Map<String, String> cMap = new HashMap<String, String>();
while (m.find()) {
   cMap.put(m.group(1).toLowerCase(), m.group(2));
}
System.out.printf("cMap: %s%n", cMap);

The output in all cases will be:

cMap: {address=300, dept=200, cust=100}
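
To get from the map to the `group1`/`group2`/`group3` output the question asks for, the values can be read by key, which does not depend on the order of the input or on how the map prints. A sketch with the imports included; the class and method names are placeholders of mine:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AttributeMapDemo {
    private static final Pattern ATTR =
            Pattern.compile("(?i)(cust|dept|address)_(\\d+)");

    // Collects every name_digits pair found in the input, keyed by lower-cased name.
    static Map<String, String> parse(String input) {
        Map<String, String> values = new HashMap<>();
        Matcher m = ATTR.matcher(input);
        while (m.find()) {
            values.put(m.group(1).toLowerCase(), m.group(2));
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> v = parse("address_300cust_100dept_200");
        // Look the values up by name, so the input order no longer matters.
        System.out.println("group1 = " + v.get("cust"));
        System.out.println("group2 = " + v.get("dept"));
        System.out.println("group3 = " + v.get("address"));
    }
}
```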

Upvotes: 0

Anirudha

Reputation: 32797

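You can do this with lookbehinds. Note that find() has to succeed before group() can be called:

Matcher c = Pattern.compile("(?<=cust_)\\d+").matcher(input);
String customer = c.find() ? c.group() : null;
Matcher d = Pattern.compile("(?<=dept_)\\d+").matcher(input);
String department = d.find() ? d.group() : null;
Matcher a = Pattern.compile("(?<=address_)\\d+").matcher(input);
String address = a.find() ? a.group() : null;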

Upvotes: 1

Dracs

Reputation: 457

Your current regex won't match the first two examples because of the underscores between the sections. The last two examples have a separate problem: the sections appear in a different order from the one the regex expects.

Your best bet would be to run the three parts of your regex separately, as three different expressions. This lets each one extract its value regardless of the order.

The following is another alternative, which matches more generally and allows any name/value combination. The first group will be one entire section (e.g. "cust_100"), the second group would be "cust" and the third group would be "100".
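
A minimal sketch of the three-expression approach, assuming a missing attribute should come back as `null` (the class and helper names are placeholders, not from the question):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SeparatePatternsDemo {
    private static final Pattern CUST = Pattern.compile("cust_(\\d+)");
    private static final Pattern DEPT = Pattern.compile("dept_(\\d+)");
    private static final Pattern ADDRESS = Pattern.compile("address_(\\d+)");

    // Returns capture group 1 of the first match, or null if there is none.
    static String firstGroup(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    // Always returns { cust, dept, address }, whatever order they appear in.
    static String[] extract(String input) {
        return new String[] {
            firstGroup(CUST, input),
            firstGroup(DEPT, input),
            firstGroup(ADDRESS, input)
        };
    }

    public static void main(String[] args) {
        String[] g = extract("dept_200_address_300cust_100");
        System.out.println("group1 = " + g[0]);
        System.out.println("group2 = " + g[1]);
        System.out.println("group3 = " + g[2]);
    }
}
```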

((\w+)_(\d+)_?)+
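
One caveat when trying this in Java: `\w` also matches digits and underscores, so a greedy `\w+` can swallow more than the attribute name, and a repeated group only keeps its last iteration. A sketch of a variant that avoids both, assuming attribute names consist of ASCII letters only (my assumption) and using a `find()` loop rather than an outer `+`:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GenericPairsDemo {
    // Letters only for the name, so the match cannot run across "_digits".
    private static final Pattern PAIR = Pattern.compile("([A-Za-z]+)_(\\d+)");

    // Returns name -> digits for every pair, in order of appearance.
    static Map<String, String> parse(String input) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parse("cust_100dept_200_address_300"));
    }
}
```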

Upvotes: 2

AMADANON Inc.

Reputation: 5919

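Combining string literals with + is exactly the same as writing them as one string, so your pattern only matches the three sections in that order with nothing in between.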

I would suggest doing this as 3 different regexes, one for each pattern.
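
A small check of what the concatenated pattern from the question actually is and which inputs it accepts (the class name and the input without separators are made up for illustration):

```java
import java.util.regex.Pattern;

class ConcatenationDemo {
    // The three literals from the question, joined by +.
    static final String JOINED = "cust_(\\d+)" + "dept_(\\d+)" + "address_(\\d+)";

    static boolean found(String input) {
        return Pattern.compile(JOINED).matcher(input).find();
    }

    public static void main(String[] args) {
        // The concatenation is one contiguous pattern.
        System.out.println(JOINED.equals("cust_(\\d+)dept_(\\d+)address_(\\d+)"));
        // Matches only when the sections are adjacent and in this exact order.
        System.out.println(found("cust_100dept_200address_300"));
        // An extra underscore or a different order prevents the match.
        System.out.println(found("cust_100dept_200_address_300"));
        System.out.println(found("address_300cust_100dept_200"));
    }
}
```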

Upvotes: 0
