Reputation: 10530
My input string can take any of the forms below:
"cust_100dept_200_address_300";
"cust_100_dept_200_address_300";
"dept_200_address_300cust_100";
"address_300cust_100dept_200";
Basically there are three attributes, i.e. cust, dept and address, each followed by an underscore and some digits. Their sequence is flexible, as shown in my examples, where cust_100 can come at the beginning, in the middle, or at the end.
I want the digits (i.e. the part after the underscore) for each attribute. So my expected output, whatever the order of the input attributes, is
group1 = 100
group2 = 200
group3 = 300
I tried the following:
Pattern p = Pattern.compile(
"cust_(\\d+)" +
"dept_(\\d+)" + "address_(\\d+)");
Matcher m = p.matcher(input);// where input can be anything i stated in the beginning
if (m.find()) {
System.out.println("inside while");
System.out.println("group1 = " + m.group(1));
System.out.println("group2 = " + m.group(2));
System.out.println("group3 = " + m.group(3));
}
But I am not getting the desired output.
Upvotes: 0
Views: 228
Reputation: 136002
I would do it differently, with one replaceAll call per attribute (s is the input string):
String g1 = s.replaceAll(".*cust_(\\d+).*", "$1");
String g2 = s.replaceAll(".*dept_(\\d+).*", "$1");
String g3 = s.replaceAll(".*address_(\\d+).*", "$1");
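One caveat worth sketching: replaceAll returns the input unchanged when the pattern does not match, so a missing attribute would silently yield the whole string. The variant below guards against that with matches(). The class name ReplaceAllDemo and the helper digitsAfter are made up for illustration, and returning null for a missing attribute is an assumption, not something the question specifies.

```java
public class ReplaceAllDemo {
    // Hypothetical helper: returns the digits after "<name>_", or null if absent.
    // Assumes name contains only plain letters (no regex metacharacters).
    static String digitsAfter(String s, String name) {
        String regex = ".*" + name + "_(\\d+).*";
        // replaceAll would return s unchanged on no match, so check first.
        return s.matches(regex) ? s.replaceAll(regex, "$1") : null;
    }

    public static void main(String[] args) {
        String s = "dept_200_address_300cust_100";
        System.out.println("group1 = " + digitsAfter(s, "cust"));
        System.out.println("group2 = " + digitsAfter(s, "dept"));
        System.out.println("group3 = " + digitsAfter(s, "address"));
    }
}
```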
Upvotes: 2
Reputation: 785098
Rather than building 3 patterns and 3 matchers, I believe you will be better off with just 1 generic pattern. Consider the following code:
String str = "cust_100_dept_200_address_300"; // your input
Pattern p = Pattern.compile("(?i)(cust|dept|address)_(\\d+)");
Matcher m = p.matcher(str);
Map<String, String> cMap = new HashMap<String, String>();
while (m.find()) {
cMap.put(m.group(1).toLowerCase(), m.group(2));
}
System.out.printf("cMap: %s%n", cMap);
The output in all cases will be (HashMap iteration order is not guaranteed, so the key order may differ):
cMap: {address=300, dept=200, cust=100}
Upvotes: 0
Reputation: 32797
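You can do this with lookbehinds, one pattern per attribute:
Matcher c = Pattern.compile("(?<=cust_)\\d+").matcher(input);
String customer = c.find() ? c.group() : null; // group() throws unless find() succeeded first
Matcher d = Pattern.compile("(?<=dept_)\\d+").matcher(input);
String department = d.find() ? d.group() : null;
Matcher a = Pattern.compile("(?<=address_)\\d+").matcher(input);
String address = a.find() ? a.group() : null;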
Upvotes: 1
Reputation: 457
Your current regex won't match the first two examples because of the underscores between the sections. There is also a separate issue with the last two examples, where the sections appear in a different order from the one your pattern expects.
Your best bet would be to run the three different parts of your regex separately as three different expressions. This will allow them to extract the details regardless of the order.
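A minimal sketch of that suggestion, assuming a missing attribute should come back as null. The class name SeparatePatterns and the helper extract are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SeparatePatterns {
    // Hypothetical helper: searches for "<name>_<digits>" anywhere in the input.
    static String extract(String input, String name) {
        Matcher m = Pattern.compile(Pattern.quote(name) + "_(\\d+)").matcher(input);
        return m.find() ? m.group(1) : null; // null when the attribute is missing
    }

    public static void main(String[] args) {
        String input = "address_300cust_100dept_200";
        // Each attribute is looked up independently, so order does not matter.
        System.out.println("group1 = " + extract(input, "cust"));
        System.out.println("group2 = " + extract(input, "dept"));
        System.out.println("group3 = " + extract(input, "address"));
    }
}
```

Because each search runs on its own, the optional underscores between sections no longer matter either.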
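The following is another alternative that matches more generally and allows any name/value combination. The intent is that the first group is one entire section (e.g. "cust_100"), the second group is the name ("cust") and the third group is the value ("100"). Note that \w also matches digits and underscores, and that a repeated group keeps only its last iteration, so in Java a narrower pattern such as ([A-Za-z]+)_(\d+) applied repeatedly with find() is more reliable.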
((\w+)_(\d+)_?)+
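A hedged sketch of the name/value idea in Java. It swaps \w+ for a letters-only name class and calls find() in a loop instead of repeating the group, since \w would also swallow the digits and underscores. The class name NameValuePairs and the method parse are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameValuePairs {
    // Assumption: names consist of ASCII letters only, values of digits only.
    private static final Pattern PAIR = Pattern.compile("([A-Za-z]+)_(\\d+)");

    // Collects every name_value pair in the order it appears in the input.
    static Map<String, String> parse(String input) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2));
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parse("address_300cust_100dept_200"));
    }
}
```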
Upvotes: 2
Reputation: 5919
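Combining strings with + is just the same as running them together, so your pattern is the single regex cust_(\d+)dept_(\d+)address_(\d+), which only matches when the three attributes appear in exactly that order with nothing between them.
I would suggest doing this as 3 different regexes, one for each attribute.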
Upvotes: 0