Reputation: 425003
Supposedly, it is "best practice" to specify the encoding when creating a String from a byte[]:
byte[] bytes;
String a = new String(bytes, "UTF-8"); // 100% safe
String b = new String(bytes); // safe enough
If I know my installation has default encoding of utf8, is it really necessary to specify the encoding to still be "best practice"?
Upvotes: 1
Views: 115
Reputation: 718788
If I know my installation has default encoding of utf8, is it really necessary to specify the encoding to still be "best practice"?
But do you know for sure that your installation will always have a default encoding of UTF-8? (Or at least, for as long as your code is used ...)
And do you know for sure that your code is never going to be used in a different installation that has a different default encoding?
If the answer to either of those is "No" (and unless you are prescient, it probably has to be "No") then I think that you should follow best practice ... and specify the encoding if that is what your application semantics requires:
If the requirement is to always encode (or decode) in UTF-8, then use "UTF-8".
If the requirement is to always encode (or decode) using the platform default, then do that.
If the requirement is to support multiple encodings (or the requirement might change), then make the encoding name a configuration (or command line) parameter, resolve it to a Charset object and use that.
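As a rough sketch, the three cases above might look like this in Java. The class and method names (DecodeChoices, decodeUtf8, decodePlatformDefault, decodeConfigured) are made up for illustration, and taking the charset name from the first command-line argument is just one assumed way to make it configurable:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class DecodeChoices {
    // Case 1: the requirement is "always UTF-8".
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Case 2: the requirement is "whatever the platform default is".
    // Naming the default explicitly documents that this is intentional.
    static String decodePlatformDefault(byte[] bytes) {
        return new String(bytes, Charset.defaultCharset());
    }

    // Case 3: the encoding is configurable. Charset.forName throws an
    // IllegalArgumentException subclass if the name is malformed or unsupported.
    static String decodeConfigured(byte[] bytes, String charsetName) {
        Charset charset = Charset.forName(charsetName);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "h\u00e9llo" is "hello" with an e-acute, escaped so the source file
        // itself does not depend on any particular encoding.
        byte[] bytes = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);

        // Assumed convention: first argument is the charset name, else UTF-8.
        String charsetName = args.length > 0 ? args[0] : "UTF-8";

        System.out.println(decodeUtf8(bytes));
        System.out.println(decodePlatformDefault(bytes));
        System.out.println(decodeConfigured(bytes, charsetName));
    }
}
```

Using the StandardCharsets constant rather than the "UTF-8" string also avoids the checked UnsupportedEncodingException that the String-name constructor declares.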
The point of this "best practice" recommendation is to avoid a foreseeable problem that will arise if your platform's characteristics change. You don't think that is likely, but you probably can't be completely sure about it. But at the end of the day, it is your decision.
(The fact that you are actually thinking about whether "best practice" is appropriate to your situation is a GOOD THING ... in my opinion.)
Upvotes: 1
Reputation: 43728
Different use cases have to be distinguished here: If you get the bytes from an external source via some protocol with a specified encoding then always use the first form (with explicit encoding).
If the source of the bytes is the local machine, for example a local text file, the second form (without explicit encoding) is better.
Always keep in mind that your program may be used on a different machine with a different platform encoding. It should work there without any changes.
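To see what goes wrong on a machine with a different platform encoding, here is a small sketch that simulates it. It does not change the real default; instead, the hypothetical decodeAs helper decodes the same UTF-8 bytes with ISO-8859-1, which stands in for "some other machine's default" (an assumption chosen purely for the demonstration):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; ISO-8859-1 is only a stand-in for a different default.
class DefaultCharsetDemo {
    // Decodes the bytes as if assumedDefault were the platform default encoding.
    static String decodeAs(byte[] bytes, Charset assumedDefault) {
        return new String(bytes, assumedDefault);
    }

    public static void main(String[] args) {
        // U+00E9 (e-acute) is the two bytes 0xC3 0xA9 in UTF-8.
        byte[] utf8Bytes = "\u00e9".getBytes(StandardCharsets.UTF_8);

        String onUtf8Machine = decodeAs(utf8Bytes, StandardCharsets.UTF_8);
        String onLatin1Machine = decodeAs(utf8Bytes, StandardCharsets.ISO_8859_1);

        // Same bytes, different text: one character versus two.
        System.out.println(onUtf8Machine.length());   // 1
        System.out.println(onLatin1Machine.length()); // 2
        System.out.println(onLatin1Machine.equals("\u00c3\u00a9")); // true

        // What new String(bytes) would actually use on this machine.
        System.out.println("Default here: " + Charset.defaultCharset());
    }
}
```

If the bytes come from a source with a known encoding, passing that encoding explicitly gives the first result on every machine.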
Upvotes: 3