Reputation: 780
In Java, a ZIP archive can be parsed with a specified charset, for instance by using the ZipFile(File, Charset) constructor.
JarFile (in the java.util.jar package) inherits from ZipFile, but does not offer a way to use a charset other than UTF-8. I need to parse JAR files that contain strings not encoded in UTF-8. What would be the cleanest workaround?
(I have thought of using reflection to modify the private field ZipFile.zc right after the JarFile() constructor returns, but this solution is not robust and is Oracle-specific.)
Upvotes: 2
Views: 638
Reputation: 42585
According to the documentation, the Charset parameter is only used "to decode the ZIP entry name and comment". It is therefore irrelevant for the content of the archived files. When you read a file from a ZipFile or JarFile, you get an InputStream, which is agnostic regarding the charset used.
Therefore you have to apply the correct charset yourself when converting the byte-based InputStream to a character-based reader, e.g. by using an InputStreamReader.
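A minimal sketch of that approach, assuming the entry is a text file whose encoding you already know. The class name JarEntryText and the method readEntry are made up for illustration; only JarFile, ZipEntry and InputStreamReader come from the JDK.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;

// Hypothetical helper class, not part of the JDK.
class JarEntryText {

    // Reads one entry of the jar as text, decoding its bytes with the given charset.
    static String readEntry(String jarPath, String entryName, Charset charset) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            ZipEntry entry = jar.getEntry(entryName);
            if (entry == null) {
                throw new FileNotFoundException(entryName + " not found in " + jarPath);
            }
            // The InputStream only delivers bytes; the charset is applied by the reader.
            try (Reader reader = new InputStreamReader(jar.getInputStream(entry), charset)) {
                StringBuilder text = new StringBuilder();
                char[] buffer = new char[4096];
                int n;
                while ((n = reader.read(buffer)) != -1) {
                    text.append(buffer, 0, n);
                }
                return text.toString();
            }
        }
    }
}
```

For example, JarEntryText.readEntry("app.jar", "messages.txt", Charset.forName("windows-1252")) would decode that entry as windows-1252 (the file and entry names here are placeholders).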
Edit:
In case we are talking about the file names in the ZIP file, you should be able to create a parallel ZipFile instance on the same file. Use JarFile.getName() to read out the jar file path.
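A rough sketch of the parallel-instance idea, assuming the JarFile itself opens without problems and only the entry names need a different charset. JarEntryNames and entryNames are hypothetical names; ZipFile(String, Charset) and JarFile.getName() are the JDK calls mentioned above.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class, not part of the JDK.
class JarEntryNames {

    // Lists the entry names of an already opened jar, decoded with the given charset.
    static List<String> entryNames(JarFile jar, Charset charset) throws IOException {
        List<String> names = new ArrayList<>();
        // Open a second, independent view of the same file with the desired charset.
        try (ZipFile zip = new ZipFile(jar.getName(), charset)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }
}
```

The JarFile stays available for everything jar-specific (manifest, signatures), while the second ZipFile is only used to look up names and comments.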
Upvotes: 1