Kyle

Reputation: 4288

Apache commons base64 decode and Sun base64 decode

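import org.apache.commons.codec.binary.Base64; // Apache Commons Codec
import sun.misc.BASE64Decoder;                 // JDK-internal Sun decoder

// Decode the same String with both libraries and log the resulting lengths.
byte[] commonsDecode = Base64.decodeBase64(data);
Log.debug("The data is " + commonsDecode.length + " bytes long for the apache commons base64 decoder.");
BASE64Decoder decoder = new BASE64Decoder();
byte[] sunDecode = decoder.decodeBuffer(data);
Log.debug("The data is " + sunDecode.length + " bytes long for the SUN base64 decoder.");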

Please explain why these two method calls produce byte arrays of different lengths. I initially thought it might have to do with character encodings, but if so, I don't fully understand the issue. The code above was executed on the same system and in the same application, in the order shown, so the default character encoding was the same for both calls.

The input (test) data, as printed by System.out.println of the Java String:

qFkIQgDq jk3ScHpqx8BPVS97YE4pP/nBl5Qw7mBnpSGqNqSdGIkLPVod0pBl Uz7NgpizHDicGzNCaauefAdwGklpPr0YdwCu4wRkwyAuvtDmL0BYASOn2tDw72LMz5FChtSa0CoCBQ2ARsFG2GdflnIWsUuBQapX73ZBMiqqm  ZCOnMRv9Ol8zT1TECddlKZMYAvmjANgq0sBPyUMF7co XY9BYAjV3L/cA8CGQpXGdrsAgjPKMhzk4hh1GAoQ1soX2Dva8p3erPJ4sy2Vcb6lS1Hap9FR0AZFawbJ10FFSTg10wxc24539kYA6xxq/TFqkhaEoSyTqjXjvo1SA==

The Apache Commons decoder returns a byte array of length 252. The Sun decoder returns one of length 256.

Upvotes: 0

Views: 2211

Answers (1)

Codo

Reputation: 78825

Your input is not valid Base64 data.

Valid Base64 data can contain whitespace. Usually, it has a line break at a fixed interval (76 characters in MIME, 64 in PEM). However, your data contains spaces in random places. If they are removed (as a whitespace-tolerant Base64 decoder does), 339 characters remain. Yet the length of valid, padded Base64 data has to be a multiple of 4 characters, and 339 is not.
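The length check described above can be sketched in a few lines. This is only an illustration: the class and method names are made up, and it assumes padded Base64 (unpadded variants do not have to be a multiple of 4).

```java
final class Base64LengthCheck {

    /** Removes all whitespace (spaces, tabs, line breaks) from Base64 text. */
    static String stripWhitespace(String base64) {
        return base64.replaceAll("\\s", "");
    }

    /** True if the text, once whitespace is removed, has a length that is a multiple of 4. */
    static boolean hasValidLength(String base64) {
        return stripWhitespace(base64).length() % 4 == 0;
    }

    public static void main(String[] args) {
        // A line break inside otherwise valid data is harmless: 8 characters remain.
        System.out.println(hasValidLength("AAAA\nAAA=")); // true
        // "++++" with one '+' turned into a space: only 3 characters remain.
        System.out.println(hasValidLength("+ ++"));       // false
    }
}
```

Running the same check on the data from the question leaves 339 characters, so it fails.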

Interestingly, your data contains no plus signs. I suspect it once contained them but they have probably been replaced with spaces somewhere in transmission. If you replace all spaces with plus signs, the Base64 data is valid and the decoded data will have a length of 256 bytes: 344 characters / 4 * 3 - 2 padding characters.
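A minimal sketch of that repair, using java.util.Base64 (Java 8+) rather than either decoder from the question. The helper names are hypothetical, and the repair assumes that every space in the damaged text was originally a plus sign, which is a guess about how the data was corrupted. The sample bytes are chosen only because they encode to plus signs.

```java
import java.util.Arrays;
import java.util.Base64;

final class PlusSignRepair {

    /** Puts back '+' characters, assuming every space was a '+' before transmission. */
    static String restorePlusSigns(String damaged) {
        return damaged.replace(' ', '+');
    }

    /** Decoded size of padded Base64: 3 bytes per 4 characters, minus 1 byte per '='. */
    static int decodedLength(int encodedChars, int paddingChars) {
        return encodedChars / 4 * 3 - paddingChars;
    }

    public static void main(String[] args) {
        // These bytes encode to "++++AQ==", so the damage is easy to reproduce.
        byte[] original = {(byte) 0xFB, (byte) 0xEF, (byte) 0xBE, 0x01};
        String encoded = Base64.getEncoder().encodeToString(original);
        String damaged = encoded.replace('+', ' '); // simulate the loss in transit

        byte[] repaired = Base64.getDecoder().decode(restorePlusSigns(damaged));
        System.out.println(Arrays.equals(original, repaired)); // true

        // The arithmetic for the data in the question: 344 characters, 2 padding characters.
        System.out.println(decodedLength(344, 2)); // 256
    }
}
```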

I further suspect that the Base64 data was used in a URL without proper URL encoding. That's a probable cause for the missing plus signs, since an unescaped plus sign in a query string is decoded as a space. Note that standard Base64 encoded data is not URL safe. Both the plus and the equals signs need to be escaped.
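To illustrate the suspected cause, here is a sketch with the standard java.net.URLEncoder and URLDecoder classes. Whether your application decodes the value this way is an assumption, and the class and method names are invented for the example. The last helper shows an alternative: the URL-safe Base64 alphabet from java.util.Base64 (Java 8+), which needs no escaping.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.Base64;

final class Base64InUrl {

    /** Percent-encodes Base64 text so that '+', '/' and '=' survive inside a URL. */
    static String escapeForUrl(String base64) {
        try {
            return URLEncoder.encode(base64, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Decodes a query-string value the way form decoding does: '+' becomes a space. */
    static String decodeQueryValue(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encodes with the URL-safe alphabet ('-' and '_'), without '=' padding. */
    static String encodeUrlSafe(byte[] data) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(data);
    }

    public static void main(String[] args) {
        String base64 = "++++AQ==";
        // Sent unescaped, the plus signs come back as spaces.
        System.out.println("[" + decodeQueryValue(base64) + "]");
        // Escaped first, the text survives the round trip.
        System.out.println(decodeQueryValue(escapeForUrl(base64)).equals(base64)); // true
    }
}
```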

Upvotes: 2
