Reputation: 1178
As of Java 1.7, StandardCharsets are part of the standard library, but I work with a lot of legacy code which was written well before that was implemented. I have been replacing stuff with StandardCharsets whenever I run across it (primarily to make the code prettier/cleaner), but I have worries about making these changes in areas which have performance-critical sections or that I can't easily debug.
Is there any technical reason for not using StandardCharsets? As in, are there 'gotchas' or inefficiencies that might arise from using StandardCharsets instead of Guava charsets or something like getBytes("UTF-8")? I know that "These charsets are guaranteed to be available on every implementation of the Java platform.", but I don't know if they're slower or have quirks that the older methods don't have.
To try and keep this on-topic, assume that there's no subjective force affecting this like the preference of other developers, resistance to change, etc.
Also, if it affects anything, UTF-8 is the encoding I really care about.
Upvotes: 3
Views: 4109
Reputation: 36339
You should use them, if only because you can't get an UnsupportedCharsetException, which is what happens if you use the forName methods and misspell the name.
It is always a good idea to "move" the possibility of an error from runtime to compile time.
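To illustrate the runtime-versus-compile-time point, here is a minimal sketch. The class and method names (CharsetLookupDemo, lookupFails, encodeByName, encodeByConstant) and the misspelled name "UTF-88" are made up for the example; only the JDK calls are real.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

public class CharsetLookupDemo {

    // Name-based lookup: a typo in the name compiles fine and only
    // fails when this line actually runs.
    static boolean lookupFails(String name) {
        try {
            Charset.forName(name);
            return false;
        } catch (UnsupportedCharsetException e) {
            return true;
        }
    }

    // The String overload declares a checked exception, so the caller
    // needs a try/catch even for a name that is always supported.
    static byte[] encodeByName(String text) {
        try {
            return text.getBytes("UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError("UTF-8 is required on every JVM", e);
        }
    }

    // The constant cannot be misspelled without a compile error,
    // and the Charset overload declares no checked exception.
    static byte[] encodeByConstant(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("Typo fails at runtime: " + lookupFails("UTF-88"));
        System.out.println("Bytes via name: " + encodeByName("abc").length);
        System.out.println("Bytes via constant: " + encodeByConstant("abc").length);
    }
}
```

A typo such as StandardCharsets.UTF_88 would simply not compile, whereas the string version is only caught when the code path is exercised.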
Upvotes: 2
Reputation: 4251
As in, are there 'gotchas' or inefficiencies that might arise from using StandardCharsets instead of Guava charsets or something like getBytes("UTF-8")?
First of all, java.nio.charset.StandardCharsets.UTF_8 (as implemented in OpenJDK/Oracle JDK), com.google.common.base.Charsets.UTF_8 and org.apache.commons.io.Charsets.UTF_8 are all implemented exactly identically:
public static final Charset UTF_8 = Charset.forName("UTF-8");
So, at least, you don't have to worry about differences with Guava Charsets or with Charset.forName("UTF-8").
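If you want to convince yourself on your own JRE, a quick check along these lines should do. This is only a sketch: CharsetEqualityDemo, sameCharset and sameBytes are hypothetical names for the example, and it checks the JDK constant against a name lookup only (no Guava or Commons IO on the classpath is assumed).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetEqualityDemo {

    // The constant and the name lookup should refer to the same charset.
    static boolean sameCharset() {
        return StandardCharsets.UTF_8.equals(Charset.forName("UTF-8"));
    }

    // Encoding through either one should yield identical bytes.
    static boolean sameBytes(String text) {
        byte[] viaConstant = text.getBytes(StandardCharsets.UTF_8);
        byte[] viaLookup = text.getBytes(Charset.forName("UTF-8"));
        return Arrays.equals(viaConstant, viaLookup);
    }

    public static void main(String[] args) {
        System.out.println("Same charset: " + sameCharset());
        System.out.println("Same bytes: " + sameBytes("gr\u00fc\u00dfe \u20ac"));
    }
}
```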
As for String.getBytes(String) and String.getBytes(Charset), I do see a difference in the documentation:
String.getBytes(Charset): "This method always replaces malformed-input and unmappable-character sequences with this charset's default replacement byte array."
String.getBytes(String): "The behavior of this method when this string cannot be encoded in the given charset is unspecified."
So, depending on which JRE you use, I expect there might be a difference in the handling of unmappable characters between someString.getBytes("UTF-8") and someString.getBytes(StandardCharsets.UTF_8).
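For UTF-8, about the only input that cannot be encoded is an unpaired surrogate, so that is the case to test. The sketch below shows the documented behavior of the Charset overload, plus a strict CharsetEncoder for code that would rather fail than silently substitute. The class and method names are made up for the example; what getBytes("UTF-8") does with the same input is left out on purpose, since the documentation calls it unspecified.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MalformedInputDemo {

    // 'a', an unpaired high surrogate, 'b': the middle char has no valid encoding.
    static final String LONE_SURROGATE = "a\uD800b";

    // Documented behavior: the bad sequence is swapped for the charset's
    // default replacement bytes (a single '?' for UTF-8).
    static byte[] lenientEncode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // A fresh encoder reports errors by default, so bad input throws
    // instead of being replaced.
    static boolean strictEncodeFails(String text) {
        try {
            StandardCharsets.UTF_8.newEncoder().encode(CharBuffer.wrap(text));
            return false;
        } catch (CharacterCodingException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(lenientEncode(LONE_SURROGATE)));
        System.out.println("Strict encoder rejects it: " + strictEncodeFails(LONE_SURROGATE));
    }
}
```

If legacy code depends on what getBytes("UTF-8") did with such input on a particular JRE, running that input through both overloads before switching is the safest way to find out.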
Upvotes: 4
Reputation: 101
The best reason to not use StandardCharsets would probably be the use of special characters. Not every character has been available since Java 1 and therefore it's likely that although this is the best for legacy programs, it's not universally accessible and useful to everyone.
Then again, it's probably fine for most people, and I can't imagine any performance issues resulting from it.
Upvotes: 0