X.walt

Reputation: 563

Merge two patterns into one

I need to write a pattern that removes the currency symbol and commas. For example, given Fr.-145,000.01 the pattern matcher should return -145000.01.

The pattern I am using:

^[^0-9\\-]*([0-9\\-\\.\\,]*?)[^0-9\\-]*$

This returns -145,000.01

I then remove the comma in a second step to get -145000.01. Is it possible to change the pattern so that it returns -145000.01 directly?

String pattern = "^[^0-9\\-]*([0-9\\-\\.\\,]*?)[^0-9\\-]*$";
Pattern p = Pattern.compile(pattern);
Matcher m = p.matcher(str);
if(m.matches()) {
 System.out.println(m.group(1));
}

I expect the output to come back with the comma already removed.

Upvotes: 2

Views: 151

Answers (5)

The fourth bird

Reputation: 163217

You could use 2 capturing groups and make use of repeated matching with the \G anchor, which asserts the position at the end of the previous match.

(?:^[^0-9+-]+(?=[.+,\d-]*\.\d+$)([+-]?\d{1,3})|\G(?!^)),(\d{3})

In Java

String regex = "(?:^[^0-9+-]+(?=[.+,\\d-]*\\.\\d+$)([+-]?\\d{1,3})|\\G(?!^)),(\\d{3})";

Explanation

  • (?: Non capturing group
  • ^[^0-9+-]+ Match 1+ times not a digit, + or -
  • (?= Positive lookahead, assert that what follows is:
    • [.+,\d-]*\.\d+$ Match 0+ times what is allowed and assert ending on . and 1+ digits
  • ) Close positive lookahead
  • ( Capturing group 1
    • [+-]?\d{1,3} Match optional + or - followed by 1-3 digits
  • ) Close capturing group 1
  • | Or
  • \G(?!^) Assert position at the end of the previous match, not at the start
  • ), Close non capturing group and match ,
  • (\d{3}) Capture in group 2 matching 3 digits

In the replacement use the 2 capturing groups $1$2
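
As a minimal sketch of how the replacement could be wired up in Java (the class and method names here are hypothetical, chosen only for illustration): Java substitutes an empty string for a group that did not take part in a match, so $1$2 also works for the matches that come from the \G branch.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class GAnchorDemo {
    // The regex from the answer above, compiled once.
    private static final Pattern P = Pattern.compile(
        "(?:^[^0-9+-]+(?=[.+,\\d-]*\\.\\d+$)([+-]?\\d{1,3})|\\G(?!^)),(\\d{3})");

    static String stripCurrency(String input) {
        // Group 1 is empty for matches found via \G, so only group 2 is kept there.
        return P.matcher(input).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        System.out.println(stripCurrency("Fr.-145,000.01"));
    }
}
```

Note that the lookahead requires a decimal part, so an input without one (such as Fr.-145,000) is left unchanged by this pattern.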


Upvotes: 1

Marc G. Smith

Reputation: 886

You can simplify it with String.replaceAll() and a simpler regex (provided you expect the input to be reasonably sane, i.e. without multiple decimal points embedded in the numbers or multiple negative signs):

   String str = "Fr.-145,000.01";
   // Strings are immutable, so keep the value that replaceAll returns
   String result = str.replaceAll("[^\\d-.]\\.?", "");

If you are going down this route, I would sanity check it by parsing the output with BigDecimal or Double.
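
A rough sketch of what that sanity check could look like with BigDecimal. The class name, the method name and the choice to return an Optional are assumptions made for this example, not something the approach requires.

```java
import java.math.BigDecimal;
import java.util.Optional;

// Hypothetical helper class for illustration only.
class AmountParser {
    static Optional<BigDecimal> parseAmount(String raw) {
        // Same regex as above: drop every char that is not a digit, '-' or '.',
        // together with a '.' that directly follows such a char.
        String cleaned = raw.replaceAll("[^\\d-.]\\.?", "");
        try {
            return Optional.of(new BigDecimal(cleaned));
        } catch (NumberFormatException e) {
            // The cleaned text is not a valid number, e.g. "1.2.3" or an empty string.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseAmount("Fr.-145,000.01"));
        System.out.println(parseAmount("1.2.3"));
    }
}
```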

Upvotes: 1

RealSkeptic

Reputation: 34618

My approach would be to remove all the unnecessary parts using replaceAll.

The unnecessary parts are, apparently:

  1. Any sequence which is not digits or minus at the beginning of the string.
  2. Commas

The first pattern is represented by ^[^\\d-]+. The second is merely ,.

Put them together with an |:

Pattern p = Pattern.compile("(^[^\\d-]+)|,");
Matcher m = p.matcher(str);
String result = m.replaceAll("");
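
For illustration, here is the same alternation wrapped in a small self-contained class (the names are hypothetical) and run against a few inputs. It also shows one limit of this sketch: only a prefix is stripped, so any trailing text stays in the result.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper class for illustration only.
class PrefixAndCommaStripper {
    // Either a leading run of chars that are not digits or '-', or a comma.
    private static final Pattern P = Pattern.compile("(^[^\\d-]+)|,");

    static String strip(String input) {
        return P.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("Fr.-145,000.01"));
        System.out.println(strip("$1,234.50"));
        // Trailing text is not covered by either alternative and is kept.
        System.out.println(strip("145,000 Fr."));
    }
}
```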

Upvotes: 1

Markus Jarderot

Reputation: 89171

String str = "Fr.-145,000.01";

Pattern regex = Pattern.compile("^[^0-9-]*(-?[0-9]+)(?:,([0-9]{3}))?(?:,([0-9]{3}))?(?:,([0-9]{3}))?(\\.[0-9]+)?[^0-9-]*$");
Matcher matcher = regex.matcher(str);
System.out.println(matcher.replaceAll("$1$2$3$4$5"));

Output:

-145000.01

It looks for a number with up to 3 commas (up to 999,999,999,999.99) and replaces it with just the digits.
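
To make the limit concrete, here is a small sketch (class and method names are hypothetical) that runs the same pattern on a number with 3 commas and on one with 4. Because the pattern is anchored with ^ and $, the second input does not match at all and replaceAll returns it unchanged.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper class for illustration only.
class FixedGroupsDemo {
    private static final Pattern P = Pattern.compile(
        "^[^0-9-]*(-?[0-9]+)(?:,([0-9]{3}))?(?:,([0-9]{3}))?(?:,([0-9]{3}))?(\\.[0-9]+)?[^0-9-]*$");

    static String strip(String input) {
        // Groups that did not participate are replaced by an empty string in Java.
        return P.matcher(input).replaceAll("$1$2$3$4$5");
    }

    public static void main(String[] args) {
        System.out.println(strip("Fr.999,999,999,999.99"));    // 3 commas: matches
        System.out.println(strip("Fr.-1,000,000,000,000.01")); // 4 commas: no match
    }
}
```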

Upvotes: 1

Emma

Reputation: 27723

One approach would be to just collect our desired digits, ., + and - in a capturing group followed by an optional comma, and then join them:

([+-]?[0-9][0-9.]+),?

Test

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The class name is arbitrary; it only makes the snippet compilable.
public class RegexTest {
    public static void main(String[] args) {
        final String regex = "([+-]?[0-9][0-9.]+),?";
        final String string = "Fr.-145,000.01\n"
             + "Fr.-145,000\n"
             + "Fr.-145,000,000\n"
             + "Fr.-145\n"
             + "Fr.+145,000.01\n"
             + "Fr.+145,000\n"
             + "Fr.145,000,000\n"
             + "Fr.145\n"
             + "Fr.145,000,000,000.01";

        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);

        // Print the full match and every capturing group for each match found
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}
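
The test above only prints the groups. A minimal sketch of the joining step, assuming a single amount per input string, could look like this (the class and method names are hypothetical):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only.
class GroupJoiner {
    private static final Pattern P = Pattern.compile("([+-]?[0-9][0-9.]+),?");

    static String join(String input) {
        StringBuilder joined = new StringBuilder();
        Matcher m = P.matcher(input);
        // Append group 1 of every match; the optional trailing comma is outside the group.
        while (m.find()) {
            joined.append(m.group(1));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        System.out.println(join("Fr.-145,000.01"));
        System.out.println(join("Fr.145,000,000,000.01"));
    }
}
```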


Upvotes: 1
