Adil F

Reputation: 447

Trim whitespaces

I am aware of the trim() method on String, and I am trying to implement it on my own to better understand regex. The following code does not seem to work in Java. Any input?

private static String secondWay(String input) {
  Pattern pattern = Pattern.compile("^\\s+(.*)(\\s$)+");
  Matcher matcher = pattern.matcher(input);
  String output = null;
  while(matcher.find()) {
    output = matcher.group(1);
    System.out.println("'"+output+"'");
  }
  return output;
}

The output for

input = "    This is a test    " is 'This is a test   '

I am able to do it another way, like this:

private static final String start_spaces = "^(\\s)+";
private static final String end_spaces = "(\\s)+$";
private static String oneWay(String input) {
       String output;
       input = input.replaceAll(start_spaces,"");
       output = input.replaceAll(end_spaces,"");
       System.out.println("'"+output+"'");
       return output;
}

The output is correct:

'This is a test'

I want to modify my first method to run correctly and return the result.

Any help is appreciated. Thank you :)

Upvotes: 2

Views: 231

Answers (2)

Michael Yaworski

Reputation: 13483

I realize you are using a Pattern and Matcher, but this is the easiest way to do it:

private static String secondWay(String input) {
    String pattern = "^\\s+|\\s+$"; // notice it's a string
    return input.replaceAll(pattern, ""); 
}

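The regex is ^\\s+|\\s+$ (written as a Java string literal, so each backslash is doubled), which matches:

  • all leading whitespace (^ anchors the match at the start of the input and \\s+ means one or more whitespace characters)
  • or (| means or)
  • all trailing whitespace ($ anchors the match at the end of the input)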
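
If this runs often, one option is to compile the pattern once instead of handing the string to replaceAll on every call. A minimal sketch, assuming a hypothetical class EdgeTrimmer and method trimEdges (neither name comes from the question):

```java
import java.util.regex.Pattern;

final class EdgeTrimmer {
    // Compiled once and reused; String.replaceAll compiles the regex on each call.
    private static final Pattern EDGE_WHITESPACE = Pattern.compile("^\\s+|\\s+$");

    // Removes whitespace at the start and at the end of the input; interior whitespace is kept.
    static String trimEdges(String input) {
        return EDGE_WHITESPACE.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println("'" + trimEdges("    This is a test    ") + "'"); // prints 'This is a test'
    }
}
```

The behavior is the same as the replaceAll version above; only the compilation cost moves out of the call.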

Upvotes: 3

hwnd

Reputation: 70732

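Your pattern is incorrect. It matches the leading whitespace, then .* greedily consumes the rest of your input and gives back only as much as it has to, so the final group (\\s$)+ captures just the last whitespace character at the end of the string. The rest of the trailing whitespace stays in group 1.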
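
To see the greedy match in action, here is a small sketch that runs the question's pattern unchanged and returns both capture groups. The class name GreedyDemo and the helper groups are hypothetical, added only for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyDemo {
    // The pattern from the question, unchanged.
    private static final Pattern ORIGINAL = Pattern.compile("^\\s+(.*)(\\s$)+");

    // Returns {group 1, group 2}, or null when the pattern does not match.
    static String[] groups(String input) {
        Matcher m = ORIGINAL.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] g = groups("    This is a test    ");
        System.out.println("group 1: '" + g[0] + "'"); // 'This is a test   ' (three trailing spaces)
        System.out.println("group 2: '" + g[1] + "'"); // ' ' (only the last space)
    }
}
```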

You want the following instead: use \\s+$ for the trailing whitespace, and follow .* with ? for a non-greedy match.

Pattern pattern = Pattern.compile("^\\s+(.*?)\\s+$");

Regular expression:

^              # the beginning of the string
\s+            # whitespace (\n, \r, \t, \f, \x0B, and " ") (1 or more times)
(              # group and capture to \1:
 .*?           # any character except a line terminator (0 or more times, as few as possible)
)              # end of \1
\s+            # whitespace (\n, \r, \t, \f, \x0B, and " ") (1 or more times)
$              # before an optional \n, and the end of the string
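
One thing to watch: because both \s+ parts require at least one whitespace character, this pattern does not match input that lacks whitespace on either side, and the question's method would then return null. A possible variant of secondWay, offered as a sketch under my own assumptions (not the pattern above): it uses \\s* on both sides and Pattern.DOTALL so that . also crosses line terminators. The class name RegexTrim is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexTrim {
    // Assumption: \\s* instead of \\s+ so inputs without edge whitespace still match;
    // DOTALL lets . match line terminators inside the text.
    private static final Pattern TRIM =
            Pattern.compile("^\\s*(.*?)\\s*$", Pattern.DOTALL);

    static String secondWay(String input) {
        Matcher matcher = TRIM.matcher(input);
        // matches() requires the whole input to match; fall back to the input otherwise.
        return matcher.matches() ? matcher.group(1) : input;
    }

    public static void main(String[] args) {
        System.out.println("'" + secondWay("    This is a test    ") + "'"); // prints 'This is a test'
        System.out.println("'" + secondWay("no padding") + "'");             // prints 'no padding'
    }
}
```

The lazy .*? can backtrack a lot on long inputs with many interior whitespace runs, so for production use the replaceAll approach or String.trim() is likely the simpler choice.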


EDIT: If you want to capture the leading and trailing whitespace into groups, just place a capturing group () around them as well.

Pattern pattern = Pattern.compile("^(\\s+)(.*?)(\\s+)$");

  • Group 1 contains leading whitespace
  • Group 2 contains your matched text
  • Group 3 contains trailing whitespace
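
As a rough illustration of reading the three groups, here is a short sketch. GroupDemo and split are hypothetical names, and returning null for a non-match is just one way to signal it:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupDemo {
    private static final Pattern PARTS = Pattern.compile("^(\\s+)(.*?)(\\s+)$");

    // Returns {leading, text, trailing}, or null when the input
    // does not have whitespace on both sides.
    static String[] split(String input) {
        Matcher m = PARTS.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = split("    This is a test    ");
        System.out.println("leading:  " + parts[0].length() + " chars"); // 4 chars
        System.out.println("text:     '" + parts[1] + "'");              // 'This is a test'
        System.out.println("trailing: " + parts[2].length() + " chars"); // 4 chars
    }
}
```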

FYI, if you only want to strip the leading and trailing whitespace, you can do it in one line. Strings are immutable, so assign the result:

input = input.replaceAll("^\\s+|\\s+$", "");

Upvotes: 3
