Reputation: 927
Is there a definition of character classes (alpha, numeric, alphanumeric) in any of the standard libraries? I'm checking whether a string contains only alphanumeric characters or a colon:
StringUtils.containsOnly(input, ALPHA_NUMERIC + ":");
I could define ALPHA_NUMERIC myself, but it seems that common character classes would be defined in a standard library, and I have been unable to find the definitions.
edit: I did consider regex, but for my particular use case execution time is important, and a simple scan is more efficient.
edit: Here are the test results, using Regex, CharMatcher, and a simple scan (using the same set of valid/invalid input strings for each test):
Valid Input Strings:
CharMatcher, Num Runs: 1000000, Valid Strings: true, Time (ms): 1200
Regex, Num Runs: 1000000, Valid Strings: true, Time (ms): 909
Scan, Num Runs: 1000000, Valid Strings: true, Time (ms): 96
Invalid input strings:
CharMatcher, Num Runs: 1000000, Valid Strings: false, Time (ms): 277
Regex, Num Runs: 1000000, Valid Strings: false, Time (ms): 253
Scan, Num Runs: 1000000, Valid Strings: false, Time (ms): 36
Here is the code that performed the scan:
public boolean matches(String input) {
    for (int i = 0; i < input.length(); i++) {
        char c = input.charAt(i);
        // Reject the first character that is neither a letter, a digit, nor ':'
        if (!Character.isLetterOrDigit(c) && c != ':') {
            return false;
        }
    }
    return true;
}
edit: I recompiled as a standalone program (I was previously running through Eclipse):
CharMatcherTester, Num Runs: 1000000, Valid Strings: true, Time (ms): 418
RegexTester, Num Runs: 1000000, Valid Strings: true, Time (ms): 812
ScanTester, Num Runs: 1000000, Valid Strings: true, Time (ms): 88
CharMatcherTester, Num Runs: 1000000, Valid Strings: false, Time (ms): 142
RegexTester, Num Runs: 1000000, Valid Strings: false, Time (ms): 223
ScanTester, Num Runs: 1000000, Valid Strings: false, Time (ms): 32
Source: https://bitbucket.org/jdeveloperw/testing (This is my first time posting test results to SO, so comments are appreciated.)
Upvotes: 2
Views: 363
Reputation: 198471
Guava's CharMatcher is pretty much exactly what you're asking for. Here is the wiki article. (Disclosure: I contribute to Guava.)
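// Newer Guava releases expose this matcher as CharMatcher.javaLetterOrDigit() instead of the constant.
CharMatcher matcher = CharMatcher.JAVA_LETTER_OR_DIGIT.or(CharMatcher.is(':'));
return matcher.matchesAllOf(string);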
Upvotes: 1
Reputation: 236140
Try this, using regular expressions:
boolean containsOnlyAlphanumeric = input.matches("[\\p{Alnum}:]+");
EDIT :
For the best performance you can pre-compile the pattern, store it in a statically defined pattern constant and reuse it whenever necessary:
// part of the class declaration
private static final Pattern ALPHANUMERIC_PLUS_COLON = Pattern.compile("[\\p{Alnum}:]+");
// whenever you need to check if the input matches the pattern
boolean containsOnlyAlphanumeric = ALPHANUMERIC_PLUS_COLON.matcher(input).matches();
I agree with Matthew Flaschen: you should not discard regular expressions right away. A well-built, pre-compiled regex can be as fast as, if not faster than, a scan that checks for all possible valid characters in the input string. Benchmark first!
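One detail worth checking before benchmarking a regex against a hand-written scan is that both accept the same inputs. The pattern above uses the + quantifier, which requires at least one character, while a loop that returns true when it finds no invalid character accepts the empty string. A minimal sketch of the difference (the class and method names here are hypothetical, chosen only for illustration):

```java
import java.util.regex.Pattern;

// Hypothetical demo class: compares the + and * quantifiers on empty input.
final class EmptyInputDemo {
    // Same character class as above; + means "one or more".
    static final Pattern ONE_OR_MORE = Pattern.compile("[\\p{Alnum}:]+");
    // Variant with *, meaning "zero or more".
    static final Pattern ZERO_OR_MORE = Pattern.compile("[\\p{Alnum}:]*");

    static boolean matchesOneOrMore(String input) {
        return ONE_OR_MORE.matcher(input).matches();
    }

    static boolean matchesZeroOrMore(String input) {
        return ZERO_OR_MORE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesOneOrMore(""));   // false: + rejects empty input
        System.out.println(matchesZeroOrMore(""));  // true: * accepts empty input
    }
}
```

Which behaviour is right depends on whether an empty input should count as valid in your use case.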
Upvotes: 2
Reputation: 5587
Well, it does exist when you are talking about regex, in which case the character class \w represents almost exactly that: it matches [a-zA-Z_0-9], so it also accepts an underscore. That's why the String class has the matches method.
edit: That StringUtils class probably predates Java 1.4, when the matches method was added. A lot of the functionality that the Apache Commons classes provide has been folded into the standard library. They are still useful when you have to use old versions of Java or need something that isn't in the standard library, but this doesn't seem to be one of those cases.
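One caveat when using \w for this check: in Java it is defined as [a-zA-Z_0-9] by default, so it also lets an underscore through, whereas \p{Alnum} does not. A small sketch of the difference (class and method names are hypothetical):

```java
import java.util.regex.Pattern;

// Hypothetical demo class: \w versus \p{Alnum} with a colon allowed.
final class WordClassDemo {
    // \w is [a-zA-Z_0-9] by default, so '_' is accepted.
    static final Pattern WORD_OR_COLON = Pattern.compile("[\\w:]*");
    // \p{Alnum} is [a-zA-Z0-9] by default, so '_' is rejected.
    static final Pattern ALNUM_OR_COLON = Pattern.compile("[\\p{Alnum}:]*");

    static boolean wordOrColon(String input) {
        return WORD_OR_COLON.matcher(input).matches();
    }

    static boolean alnumOrColon(String input) {
        return ALNUM_OR_COLON.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(wordOrColon("abc_123:x"));   // true
        System.out.println(alnumOrColon("abc_123:x"));  // false
    }
}
```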
Upvotes: 2
Reputation: 285047
Your best bet is probably a regex Pattern.
The input should match:
[\p{Alnum}:]*
if it is all alphanumeric (or :). The pieces are:
- \p{Alnum} - ASCII alphanumeric
- [] - character class (any of the characters inside will match one character)
- : - literal :
- * - 0 or more
You can use matches or pre-compile the regex.
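Because \p{Alnum} is ASCII-only by default, this regex and the Character.isLetterOrDigit scan from the question are not equivalent: the scan also accepts non-ASCII letters and digits. The sketch below illustrates the difference and shows an ASCII-only scan that agrees with the regex. The class and method names are hypothetical, and the ASCII-only loop is an illustrative variant, not the code that was benchmarked in the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: ASCII-only regex versus Unicode-aware scan.
final class AsciiVsUnicodeDemo {
    static final Pattern ASCII_ALNUM_OR_COLON = Pattern.compile("[\\p{Alnum}:]*");

    static boolean regexMatches(String input) {
        return ASCII_ALNUM_OR_COLON.matcher(input).matches();
    }

    // Same logic as the scan in the question: accepts any Unicode letter or digit.
    static boolean unicodeScan(String input) {
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (!Character.isLetterOrDigit(c) && c != ':') {
                return false;
            }
        }
        return true;
    }

    // ASCII-only variant: accepts only a-z, A-Z, 0-9 and ':'.
    static boolean asciiScan(String input) {
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean ok = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == ':';
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String input = "caf\u00e9:1"; // contains a non-ASCII letter (e with acute accent)
        System.out.println(regexMatches(input)); // false
        System.out.println(unicodeScan(input));  // true
        System.out.println(asciiScan(input));    // false
    }
}
```

If only ASCII input is expected, either form works; the distinction matters mainly when the input may contain accented or non-Latin characters.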
Upvotes: 5
Reputation: 146
Regex matching would do the job. For example: myString.matches("[a-zA-Z0-9:]*");
Upvotes: 0