Prakash

Reputation: 139

Regex Remove white spaces from a group

Hi, I have the following values:

000001010016C02AB  111*
000001010016C02    111H
000001010016C      111 

And the expected output is

00000101001,C02AB,*
00000101001,C02,H
00000101001,C, 

The values may vary, but the length of the string is always 23. If a character is not present, its position is filled with a white space. The regex I have now is:

(^.{11})[0-9](.{5})(?:.{5})(.*)

But with this regex, the second group is returned with trailing white spaces. I want those white spaces to be removed.

Current Output:

00000101001,C02AB,*
00000101001,C02  ,H
00000101001,C    , 

Could anyone help me remove the white spaces from the second group?


Upvotes: 5

Views: 4241

Answers (4)

Wray Zheng

Reputation: 997

Using the end assertion $ makes it easy to match:

^(.{11})\d(\w+).+(.)$
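
For illustration, here is a minimal Java sketch of how this pattern could be applied to the sample values. The class name EndAnchorDemo and the helper format are hypothetical names chosen for the example, and joining the groups with commas is an assumption based on the expected output in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EndAnchorDemo {
    // Pattern from the answer: group 2 (\w+) stops at the first space,
    // and the final (.) is pinned to the last character by $.
    private static final Pattern P = Pattern.compile("^(.{11})\\d(\\w+).+(.)$");

    // Hypothetical helper: joins the three captured groups with commas.
    static String format(String s) {
        Matcher m = P.matcher(s);
        if (!m.matches()) {
            return s; // leave input that does not match unchanged
        }
        return m.group(1) + "," + m.group(2) + "," + m.group(3);
    }

    public static void main(String[] args) {
        System.out.println(format("000001010016C02AB  111*")); // 00000101001,C02AB,*
        System.out.println(format("000001010016C02    111H")); // 00000101001,C02,H
        System.out.println(format("000001010016C      111 ")); // 00000101001,C, (with a trailing space)
    }
}
```

Because \w+ only matches word characters, this sketch assumes the second field never contains punctuation or embedded spaces.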

Upvotes: 0

Wiktor Stribiżew

Reputation: 626929

In Java, you may implement a custom replacement logic using Matcher#appendReplacement() and just trim() the matcher.group(2) value:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String[] strs = {"000001010016C02AB  111*", "000001010016C02    111H", "000001010016C      111 ", "901509010012V      154 "};
Pattern p = Pattern.compile("(.{11})[0-9](.{5}).{5}(.*)");
for (String s : strs) {
    StringBuffer result = new StringBuffer();
    Matcher m = p.matcher(s);
    if (m.matches()) {
        // Trim only group 2; quoteReplacement keeps any '$' or '\' in the data literal
        m.appendReplacement(result, Matcher.quoteReplacement(
                m.group(1) + "," + m.group(2).trim() + "," + m.group(3)));
    }
    System.out.println(result.toString());
}

Result:

00000101001,C02AB,*
00000101001,C02,H
00000101001,C, 
90150901001,V, 


Note that I removed ^ because Matcher#matches() already requires the whole string to match. Use the Pattern.DOTALL option if the string may contain line breaks.

Upvotes: 1

Toto

Reputation: 91430

  • Find: ^(.{11})\d(\S+)\s*.{3}(.?)$
  • Replace: $1,$2,$3

Explanation:

^           : beginning of string
  (.{11})   : any 11 characters, stored in group 1
  \d        : 1 digit
  (\S+)     : 1 or more non-space characters, stored in group 2
  \s*       : 0 or more spaces
  .{3}      : any 3 characters
  (.?)      : 0 or 1 character, stored in group 3
$           : end of string

Result:

00000101001,C02AB,*
00000101001,C02,H
00000101001,C, 
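
The find/replace pair above is written for an editor-style dialog. As a rough sketch, the same pair could be used from Java with String#replaceAll, which accepts the same $1,$2,$3 group references. The class name FindReplaceDemo and the helper convert are hypothetical names for this example.

```java
class FindReplaceDemo {
    // Find and replace strings taken from the answer above,
    // with backslashes doubled for a Java string literal.
    private static final String FIND = "^(.{11})\\d(\\S+)\\s*.{3}(.?)$";
    private static final String REPLACE = "$1,$2,$3";

    // Hypothetical helper: applies the find/replace pair to one value.
    static String convert(String s) {
        return s.replaceAll(FIND, REPLACE);
    }

    public static void main(String[] args) {
        System.out.println(convert("000001010016C02AB  111*")); // 00000101001,C02AB,*
        System.out.println(convert("000001010016C02    111H")); // 00000101001,C02,H
        System.out.println(convert("000001010016C      111 ")); // 00000101001,C, (with a trailing space)
    }
}
```

A value that does not match the pattern is returned unchanged by replaceAll, so malformed input would pass through silently in this sketch.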

Upvotes: 2

tomersss2

Reputation: 155

Regex has capturing groups: just concatenate these 2 groups and you'll have your result. You can insert a comma when concatenating.

 ^(\w+)\s*\d+(\D+)$

A group is whatever is enclosed in parentheses ().

Upvotes: 0
