Bharath Reddy

Reputation: 381

Parse string using Java Regex Pattern?

I have a Java string in the following format:

String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";

Using the Matcher and Pattern classes from the java.util.regex package, I need to produce an output string in the following format:

Output: [NYK:1100][CLT:2300][KTY:3540]

Can you suggest a regex pattern that produces the above output?

Upvotes: 10

Views: 36426

Answers (2)

Wiktor Stribiżew

Reputation: 627607

If the format of the string is fixed and you always have exactly 3 [name:...][distance:...] pairs to deal with, you can define a block pattern that matches one pair and captures its two parts into separate groups, then repeat that block three times in a fairly simple .replaceAll call:

String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";
String matchingBlock = "\\s*\\[name:([A-Z]+)]\\[distance:(\\d+)]";
String res = s.replaceAll(String.format(".*%1$s%1$s%1$s.*", matchingBlock), 
    "[$1:$2][$3:$4][$5:$6]");
System.out.println(res); // [NYK:1100][CLT:2300][KTY:3540]


The block pattern matches:

  • \\s* - 0+ whitespaces
  • \\[name: - a literal [name: substring
  • ([A-Z]+) - Group n capturing 1 or more uppercase ASCII chars (\\w+ can also be used)
  • ]\\[distance: - a literal ][distance: substring
  • (\\d+) - Group m capturing 1 or more digits
  • ] - a ] symbol.

In the .*%1$s%1$s%1$s.* pattern, the capturing groups are numbered 1 to 6 and are referenced as $1 to $6 in the replacement string. The leading and trailing .* match the start and end of the string so that they are dropped from the result. Add (?s) at the start of the pattern if the string can contain line breaks.
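To illustrate the (?s) remark, here is a minimal sketch that runs the same replacement with and without the flag on an input containing line breaks. The class name DotallBlocksDemo, the reformat helper, and the multi-line input are made up for this illustration; they are not part of the original answer.

```java
class DotallBlocksDemo {
    // Same block pattern as above: optional whitespace, then one [name:..][distance:..] pair.
    static final String BLOCK = "\\s*\\[name:([A-Z]+)]\\[distance:(\\d+)]";

    // Hypothetical helper: applies the three-block replacement, optionally with (?s).
    static String reformat(String s, boolean dotAll) {
        String prefix = dotAll ? "(?s)" : "";
        String pattern = prefix + String.format(".*%1$s%1$s%1$s.*", BLOCK);
        return s.replaceAll(pattern, "[$1:$2][$3:$4][$5:$6]");
    }

    public static void main(String[] args) {
        // Assumed input: the question's string with line breaks around the blocks.
        String s = "City:\n[name:NYK][distance:1100] [name:CLT][distance:2300] "
                 + "[name:KTY][distance:3540]\nPrice:";

        // With (?s), '.' also matches '\n', so the whole input is consumed.
        System.out.println(reformat(s, true));

        // Without (?s), the trailing .* stops at the line break,
        // so the text after it is left in the result.
        System.out.println(reformat(s, false));
    }
}
```

With the flag, the call returns only the three reformatted pairs. Without it, the line break and "Price:" after the last block survive, because . does not match a line terminator by default.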

Upvotes: 4

Youcef LAIDANI

Reputation: 60046

You can use the regex \[name:([A-Z]+)\]\[distance:(\d+)\] with Pattern and Matcher like this:

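import java.util.regex.Matcher;
import java.util.regex.Pattern;

// s is the input string from the question
String regex = "\\[name:([A-Z]+)\\]\\[distance:(\\d+)\\]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(s);

// Append one "[name:distance]" entry for every match found
StringBuilder result = new StringBuilder();
while (matcher.find()) {
    result.append("[");
    result.append(matcher.group(1)); // group 1: the name
    result.append(":");
    result.append(matcher.group(2)); // group 2: the distance
    result.append("]");
}

System.out.println(result.toString());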

Output

[NYK:1100][CLT:2300][KTY:3540]

  • \[name:([A-Z]+)\]\[distance:(\d+)\] captures two groups: the first, in \[name:([A-Z]+)\], gets the uppercase letters after name:, and the second, in \[distance:(\d+)\], gets the digits after distance:
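As a possible variation on the loop above, the same two groups can be collected with Matcher.results(), which returns a stream of match results. This is only a sketch and assumes Java 9 or later; the class name StreamMatchDemo and the reformat helper are made up for this illustration.

```java
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class StreamMatchDemo {
    // Same regex as above: group 1 is the name, group 2 is the distance.
    private static final Pattern BLOCK =
            Pattern.compile("\\[name:([A-Z]+)\\]\\[distance:(\\d+)\\]");

    // Hypothetical helper: turns every match into "[name:distance]" and joins them.
    static String reformat(String s) {
        return BLOCK.matcher(s).results()               // Stream<MatchResult>, Java 9+
                .map(m -> "[" + m.group(1) + ":" + m.group(2) + "]")
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] "
                 + "[name:KTY][distance:3540] Price:";
        System.out.println(reformat(s)); // [NYK:1100][CLT:2300][KTY:3540]
    }
}
```

Like the while loop, this handles any number of pairs, and an input with no matches gives an empty string.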

Another solution, suggested by @tradeJmark, uses named groups in the regex:

String regex = "\\[name:(?<name>[A-Z]+)\\]\\[distance:(?<distance>\\d+)\\]";

This way you can retrieve each group by its name instead of its index:

while (matcher.find()) {                                                
    result.append("[");
    result.append(matcher.group("name"));
    //----------------------------^^
    result.append(":");
    result.append(matcher.group("distance"));
    //------------------------------^^
    result.append("]");
}
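For completeness, here is the named-group variant assembled into a self-contained sketch. The regex and the loop are the ones shown above; the class name NamedGroupsDemo and the reformat helper are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupsDemo {
    // Named groups "name" and "distance" replace the numbered groups 1 and 2.
    private static final Pattern BLOCK = Pattern.compile(
            "\\[name:(?<name>[A-Z]+)\\]\\[distance:(?<distance>\\d+)\\]");

    // Hypothetical helper: builds "[name:distance]" for every match in the input.
    static String reformat(String s) {
        Matcher matcher = BLOCK.matcher(s);
        StringBuilder result = new StringBuilder();
        while (matcher.find()) {
            result.append("[")
                  .append(matcher.group("name"))
                  .append(":")
                  .append(matcher.group("distance"))
                  .append("]");
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] "
                 + "[name:KTY][distance:3540] Price:";
        System.out.println(reformat(s)); // [NYK:1100][CLT:2300][KTY:3540]
    }
}
```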

Upvotes: 19
