Reputation: 71
I have to read a file called test.p2b with the following content:
I tried reading it like this:
static void branjeIzDatoteke(String location){
    byte[] allBytes = new byte[10000];
    try {
        InputStream input = new FileInputStream(location);
        int byteRead;
        int j=0;
        while ((byteRead = input.read())!=-1){
            allBytes[j] = (byte)input.read();
        }
        String str = new String(allBytes,"UTF-8");
        for (int i=0;i<=str.length()-8;i+=8){
            //int charCode = Integer.parseInt(str.substring(i,i+8),2);
            //System.out.println((char)charCode);
            int drek = (int)str.charAt(i);
            System.out.println(Integer.toBinaryString(drek));
        }
    } catch (IOException ex) {
        Logger.getLogger(Slika.class.getName()).log(Level.SEVERE, null, ex);
    }
}
I tried just printing out the string (right after creating String str = new String(allBytes,"UTF-8");), but all I get is a square at the beginning and then 70+ blank lines with no text. Then I tried int charCode = Integer.parseInt(str.substring(i,i+8),2); and printing out each individual character, but that threw a NumberFormatException. I even tried just converting ... Finally, I tried the Integer.toBinaryString call I have at the end, but in that case I get 1s and 0s.
That's not what I want. I need to read the actual text, but no method seems to work. I have read a binary file before using the method I already tried: int charCode = Integer.parseInt(str.substring(i,i+8),2); System.out.println((char)charCode); but, like I said, here I get a NumberFormatException. I don't understand why these methods won't work.
Upvotes: 0
Views: 386
Reputation: 44952
If you want to read all the bytes, you can use the java.nio.file.Files utility class:
Path path = Paths.get("test.p2b");
byte[] allBytes = Files.readAllBytes(path);
String str = new String(allBytes, "UTF-8");
System.out.print(str);
Your iteration over the contents of str might not work. Certain Unicode characters are expressed as surrogate pairs, meaning a single code point can span more than one char (as explained here). Since you are decoding the data as UTF-8, you should use the String#codePoints() method to iterate over the code points instead of the individual chars.
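To illustrate the difference, here is a minimal sketch of iterating by code point. The class name CodePointDemo, the helper codePointsOf, and the sample string are made up for this example, and it works on an in-memory byte array rather than test.p2b, since the file's contents are not shown. It also assumes the bytes really are UTF-8 text.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CodePointDemo {

    // Hypothetical helper: decodes the bytes as UTF-8 and returns
    // one String per Unicode code point.
    static List<String> codePointsOf(byte[] utf8Bytes) {
        String str = new String(utf8Bytes, StandardCharsets.UTF_8);
        List<String> result = new ArrayList<>();
        // codePoints() yields whole code points, so a surrogate pair
        // arrives as a single int instead of two separate chars.
        str.codePoints()
           .forEach(cp -> result.add(new String(Character.toChars(cp))));
        return result;
    }

    public static void main(String[] args) {
        // Sample data (an assumption): "a" followed by U+1F600,
        // which occupies two chars in a Java String.
        byte[] sample = "a\uD83D\uDE00".getBytes(StandardCharsets.UTF_8);
        for (String s : codePointsOf(sample)) {
            System.out.println(s + " -> U+"
                    + Integer.toHexString(s.codePointAt(0)).toUpperCase());
        }
    }
}
```

With str.charAt(i), the second character of the sample would be seen as two separate halves of a surrogate pair. With codePoints() it comes back as one unit.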
Upvotes: 1