Reputation: 8266
I need to clean a string of characters that are illegal in Code 39 barcodes, replacing each illegal character with a space. The only valid characters in Code 39 are 0-9, A-Z, - (dash), . (dot), $ (dollar sign), / (forward slash), + (plus sign), % (percent sign), and a space.
I tried the following regular expression, but the negation seems to apply only to the first group of characters.
barcode = barcode.toUpperCase().replaceAll("[^A-Z0-9\\s\\-\\.\\s\\$/\\+\\%]*"," ");
The code seems to interpret the pattern as "if not A to Z, then replace with a space". How do I make it mean "if not A-Z and not 0-9 and not dash and not dollar sign and not forward slash, and so on, then replace the character with a space"?
Any help would be great.
Upvotes: 0
Views: 2523
Reputation: 15502
Try changing your pattern string to [^-0-9A-Z.$/+% ]; this will match a single character that is not in the Code 39 specification. Also, if this code will be executed many times, avoid using String.replaceAll(), since your pattern will get compiled on every method call. Instead, use a pre-compiled pattern as follows:
final static Pattern INVALID_CODE39_CHAR = Pattern.compile("[^-0-9A-Z.$/+% ]");
barcode = INVALID_CODE39_CHAR.matcher(barcode.toUpperCase()).replaceAll(" ");
If you want to replace each run of contiguous invalid characters with a single replacement string, add a + to the end of the pattern. The * in your original pattern matches zero or more characters that are not in your character class, so it also matches the empty string wherever a valid character follows; in effect, your replacement string (a space) gets inserted at those positions as well.
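To make the difference between the single-character pattern and the + variant concrete, here is a minimal sketch. The class name Code39Cleaner and the method names clean and cleanCollapsing are hypothetical, chosen only for illustration, and passing Locale.ROOT to toUpperCase() is my own assumption to keep upper-casing independent of the default locale.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper class; names are illustrative only.
public class Code39Cleaner {
    // Matches exactly one character outside the Code 39 set.
    private static final Pattern INVALID_CHAR =
            Pattern.compile("[^-0-9A-Z.$/+% ]");
    // Matches a run of one or more such characters.
    private static final Pattern INVALID_RUN =
            Pattern.compile("[^-0-9A-Z.$/+% ]+");

    // Replaces every invalid character with its own space.
    public static String clean(String barcode) {
        // Locale.ROOT is an assumption, not part of the original answer.
        String upper = barcode.toUpperCase(Locale.ROOT);
        return INVALID_CHAR.matcher(upper).replaceAll(" ");
    }

    // Collapses each run of invalid characters into a single space.
    public static String cleanCollapsing(String barcode) {
        String upper = barcode.toUpperCase(Locale.ROOT);
        return INVALID_RUN.matcher(upper).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println("[" + clean("ab#@12") + "]");           // [AB  12]
        System.out.println("[" + cleanCollapsing("ab#@12") + "]"); // [AB 12]
    }
}
```

With the input "ab#@12", the first method leaves two spaces (one per invalid character) and the second leaves one space for the whole run.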
Take a look at the Pattern JavaDoc for more information; also, this is very useful.
Upvotes: 2
Reputation: 285403
Why the "*" at the end? I would think that this isn't needed, and what's more will mess things up for you.
Upvotes: 1