Reputation: 23
Problem
I am trying to encode the contents of doc/pdf files as a Base64 string in Java. The encoded string is almost twice as long as the original (115k -> 230k), whereas encoding the same file contents in Python/PHP or with any online tool increases the length by only about a third (115k -> 154k).
What causes this increase in size in Java, and is there any way to get a result equivalent to the other tools?
Code
import java.util.Base64;
...
//String content;
System.out.println(content.length());
String encodedStr = new String(Base64.getEncoder().encode(content.getBytes()));
System.out.println(encodedStr.length());
String urlEncodedStr = new String(Base64.getUrlEncoder().encode(content.getBytes()));
System.out.println(urlEncodedStr.length());
String mimeEncodedStr = new String(Base64.getMimeEncoder().encode(content.getBytes()));
System.out.println(mimeEncodedStr.length());
Output
For pdf file
115747 230816 230816 236890
For doc file
13685 26392 26392 27086
Upvotes: 0
Views: 1099
Reputation: 201447
First, don't wrap the encoder output in new String; use encodeToString instead. Second, pass an explicit encoding to String.getBytes(String) (e.g. content.getBytes(encoding)). For example,
String encodedStr = Base64.getEncoder()
.encodeToString(content.getBytes("UTF-8"));
or
String encodedStr = Base64.getEncoder()
.encodeToString(content.getBytes("US-ASCII"));
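Note that the lengths in the question suggest the encoder itself is behaving normally: Base64 emits 4 characters per 3 input bytes, so 115747 bytes should give 154332 characters. An output of 230816 characters implies that content.getBytes() returned roughly 173000 bytes, which is what happens when binary file data is first decoded into a String and the non-ASCII characters are then re-encoded as multi-byte sequences. The sketch below is a minimal illustration under that assumption, not a confirmed diagnosis of the code in the question. The class name Base64SizeDemo and its helper methods are hypothetical, and ISO-8859-1 is used only to make the round trip deterministic.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

public class Base64SizeDemo {

    // Padded Base64 produces 4 output characters for every 3 input bytes (rounded up).
    public static int expectedBase64Length(int byteCount) {
        return 4 * ((byteCount + 2) / 3);
    }

    // Encode raw bytes directly, with no String round trip.
    public static String encodeBytes(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Hypothetical helper: read a file as bytes and encode it.
    public static String encodeFile(Path file) throws IOException {
        return encodeBytes(Files.readAllBytes(file));
    }

    // Mimics the suspected problem: bytes are decoded into a String first,
    // then turned back into bytes as UTF-8, so every char >= 0x80 takes 2 bytes.
    public static String encodeViaString(byte[] raw) {
        String content = new String(raw, StandardCharsets.ISO_8859_1);
        return Base64.getEncoder()
                .encodeToString(content.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Synthetic "binary" data: byte values 0..255, then 0..43 again.
        byte[] raw = new byte[300];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (byte) i;
        }

        System.out.println(expectedBase64Length(115747));   // 154332
        System.out.println(encodeBytes(raw).length());      // 400
        System.out.println(encodeViaString(raw).length());  // 572
    }
}
```

If this is the cause, reading the file with Files.readAllBytes and passing the bytes straight to the encoder should give the same length as the other tools. getBytes("US-ASCII") would also shrink the output, but it replaces unmappable characters with '?', so the encoded data would no longer match the file.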
Upvotes: 1