shinzou

Reputation: 6192

Match and remove all spaces inside brackets pattern

I want to match all spaces inside every [[ ]] pair in a string so that I can remove them with a replaceAll method.

Example input: text text [[ ia asd ]] [[asdasd]] dfgd dfaf sddgsd [[sss aaa]]

Expected output: text text [[iaasd]] [[asdasd]] dfgd dfaf sddgsd [[sssaaa]]

I thought of this: \[\[(\s*?)\]\] which should match all spaces that are between double brackets but it doesn't match anything.

I also tried several other solutions to similar problems, but none seemed to work.

Any clue what else could be used?

Upvotes: 3

Views: 196

Answers (1)

Wiktor Stribiżew

Reputation: 626861

Considering it is Java, you can use

String result = text.replaceAll("(\\G(?!^)|\\[\\[)((?:(?!]]).)*?)\\s+(?=.*?]])", "$1$2");

Or, another approach is matching all substrings between [[ and ]] and then removing any whitespace inside the matches:

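import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "text text [[ ia asd ]] [[asdasd]] dfgd dfaf sddgsd [[sss aaa]]";
Pattern p = Pattern.compile("\\[\\[.*?]]"); // one [[...]] segment per match
Matcher m = p.matcher(text);
StringBuffer buffer = new StringBuffer();
while (m.find()) {
    // Remove whitespace inside the matched segment only
    String cleaned = m.group().replaceAll("\\s+", "");
    // quoteReplacement keeps any $ or \ in the segment literal
    m.appendReplacement(buffer, Matcher.quoteReplacement(cleaned));
}
m.appendTail(buffer); // copy the text after the last match
System.out.println(buffer.toString());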

See the Java demo online.
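If Java 9 or later is available, the same match-then-clean idea can be written without the explicit loop, using Matcher.replaceAll with a function. This is only a sketch: the class name BracketSpaces and the helper strip are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketSpaces {
    // Matches one [[...]] segment (no line breaks inside, since . skips them)
    private static final Pattern SEGMENT = Pattern.compile("\\[\\[.*?]]");

    // Hypothetical helper: removes whitespace inside every [[...]] segment
    static String strip(String text) {
        return SEGMENT.matcher(text).replaceAll(
            // The returned string is treated as a replacement pattern,
            // so quote it to keep any $ or \ literal
            mr -> Matcher.quoteReplacement(mr.group().replaceAll("\\s+", "")));
    }

    public static void main(String[] args) {
        String text = "text text [[ ia asd ]] [[asdasd]] dfgd dfaf sddgsd [[sss aaa]]";
        System.out.println(strip(text));
    }
}
```

Text outside the brackets is never passed to the inner replaceAll, so its spaces are left alone.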

The first regex means:

  • (\G(?!^)|\[\[) - Group 1 ($1): either [[ or the end of the preceding successful match
  • ((?:(?!]]).)*?) - Group 2 ($2): zero or more chars other than line break chars, as few as possible, each of which does not start a ]] char sequence
  • \s+ - one or more whitespaces
  • (?=.*?]]) - immediately to the right, there must be any zero or more chars other than line break chars, as few as possible, and then ]].
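To see how the pieces listed above cooperate, it can help to print what Group 1 and Group 2 capture on each successive match. The sketch below does that; the class name RegexTrace and the helper trace are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTrace {
    // Same pattern as the replaceAll one-liner
    static final Pattern P = Pattern.compile(
        "(\\G(?!^)|\\[\\[)((?:(?!]]).)*?)\\s+(?=.*?]])");

    // Hypothetical helper: records "group1|group2" for every match
    static List<String> trace(String text) {
        List<String> out = new ArrayList<>();
        Matcher m = P.matcher(text);
        while (m.find()) {
            // Group 1 is "[[" for the first whitespace run in a segment and
            // empty when \G resumes at the end of the previous match
            out.add(m.group(1) + "|" + m.group(2));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String step : trace("[[ ia asd ]] x [[b c]]")) {
            System.out.println(step);
        }
    }
}
```

Each match ends right after one whitespace run, and replacing it with $1$2 keeps everything except that run. The chain stops at ]] because Group 2 cannot step over it, so the space before x is never reached.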

Upvotes: 2
