Reputation: 139
Hi, I have the following values:
000001010016C02AB 111*
000001010016C02 111H
000001010016C 111
And the expected output is
00000101001,C02AB,*
00000101001,C02,H
00000101001,C,
The values may vary. The length of the string will always be 23. If a character is not present, its position will be filled with a white space. The regex I have now is:
(^.{11})[0-9](.{5})(?:.{5})(.*)
But with this regex, the second group is returned with white spaces in it. I want those white spaces to be removed.
Current Output:
00000101001,C02AB,*
00000101001,C02 ,H
00000101001,C ,
Could anyone help me remove the white spaces from the second group?
Upvotes: 5
Views: 4241
Reputation: 997
Using the end assertion $ makes it easy to match:
^(.{11})\d(\w+).+(.)$
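As a rough illustration, the pattern above could be applied in Java along these lines. The class name EndAnchorSplit, the method name split, and the exact padding of the sample inputs are assumptions made for this sketch, not part of the original post. It also assumes the filler after the second field starts with a space, since \w+ would otherwise run on into it, and it trims the last group because that position may be a blank.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EndAnchorSplit {
    // Pattern from the answer: 11 characters, one digit, a run of word
    // characters, some filler, and the final character anchored by $.
    private static final Pattern P = Pattern.compile("^(.{11})\\d(\\w+).+(.)$");

    // Returns "group1,group2,group3", or null when the input does not match.
    static String split(String s) {
        Matcher m = P.matcher(s);
        if (!m.find()) {
            return null;
        }
        // The last position may be a blank, so it is trimmed here
        // (an addition made in this sketch, not in the answer's regex).
        return m.group(1) + "," + m.group(2) + "," + m.group(3).trim();
    }

    public static void main(String[] args) {
        // Hypothetical 23-character inputs; the exact padding is assumed.
        String[] inputs = {
            "000001010016C02AB  111*",
            "000001010016C02    111H",
            "000001010016C      111 "
        };
        for (String s : inputs) {
            System.out.println(split(s));
        }
    }
}
```

Because \w+ stops at the first white space, the second group comes back without padding and needs no trimming.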
Upvotes: 0
Reputation: 626929
In Java, you may implement a custom replacement logic using Matcher#appendReplacement() and just trim() the matcher.group(2) value:
String[] strs = {"000001010016C02AB 111*", "000001010016C02 111H", "000001010016C 111 ", "901509010012V 154 "};
Pattern p = Pattern.compile("(.{11})[0-9](.{5}).{5}(.*)");
for (String s : strs) {
    StringBuffer result = new StringBuffer();
    Matcher m = p.matcher(s);
    if (m.matches()) {
        // Rebuild the string from the groups, trimming the padded second group
        m.appendReplacement(result, m.group(1) + "," + m.group(2).trim() + "," + m.group(3));
    }
    System.out.println(result.toString());
}
Result:
00000101001,C02AB,*
00000101001,C02,H
00000101001,C,
90150901001,V,
See the Java demo.
Note that I removed ^ because the Matcher#matches() method requires a full string match. Use the Pattern.DOTALL option if the string may contain line breaks.
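A possible variant of the same idea is to build the output directly from the groups instead of going through appendReplacement(). This is only a sketch: the class name TrimGroupsSplit, the method name reformat, and the padding of the sample inputs are assumptions, and trimming the third group is an extra step not present in the code above. One practical difference is that appendReplacement() treats $ and \ in the replacement string specially, while plain string joining does not.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TrimGroupsSplit {
    // Same fixed-width pattern as above; matches() already requires the
    // whole string to match, so no ^ or $ is needed.
    private static final Pattern P = Pattern.compile("(.{11})[0-9](.{5}).{5}(.*)");

    // Returns the comma-separated fields, or null when the input does not match.
    static String reformat(String s) {
        Matcher m = P.matcher(s);
        if (!m.matches()) {
            return null;
        }
        // trim() drops the padding of the 5-character field; the last
        // group is trimmed as well (an assumption of this sketch).
        return String.join(",", m.group(1), m.group(2).trim(), m.group(3).trim());
    }

    public static void main(String[] args) {
        // Hypothetical 23-character inputs; the exact padding is assumed.
        String[] inputs = {
            "000001010016C02AB  111*",
            "000001010016C02    111H",
            "000001010016C      111 ",
            "901509010012V      154 "
        };
        for (String s : inputs) {
            System.out.println(reformat(s));
        }
    }
}
```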
Upvotes: 1
Reputation: 91430
^(.{11})\d(\S+)\s*.{3}(.?)$
Replacement: $1,$2,$3
Explanation:
^ : beginning of string
(.{11}) : any 11 characters, stored in group 1
\d : 1 digit
(\S+) : 1 or more non-space characters, stored in group 2
\s* : 0 or more spaces
.{3} : any 3 characters
(.?) : 0 or 1 character, stored in group 3
$ : end of string
Result:
00000101001,C02AB,*
00000101001,C02,H
00000101001,C,
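In Java, this search-and-replace could be sketched with String#replaceAll(), which accepts $1, $2, $3 as group references in the replacement. The class name ReplaceAllSplit, the method name split, and the padding of the sample inputs are assumptions for illustration. When the last position holds a blank, group 3 captures that blank, so the sketch trims the result; this trimming is an addition, not part of the regex above.

```java
class ReplaceAllSplit {
    // Applies the regex and replacement from the answer, then trims the
    // result because a blank in the last position ends up in group 3.
    // If the input does not match, replaceAll returns it unchanged.
    static String split(String s) {
        return s.replaceAll("^(.{11})\\d(\\S+)\\s*.{3}(.?)$", "$1,$2,$3").trim();
    }

    public static void main(String[] args) {
        // Hypothetical 23-character inputs; the exact padding is assumed.
        String[] inputs = {
            "000001010016C02AB  111*",
            "000001010016C02    111H",
            "000001010016C      111 "
        };
        for (String s : inputs) {
            System.out.println(split(s));
        }
    }
}
```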
Upvotes: 2
Reputation: 155
In regex, whatever is inside parentheses () is a capturing group. Just concatenate the 2 captured groups, inserting a comma between them, and you'll have your result:
^(\w+)\s*\d+(\D+)$
Upvotes: 0