parsecer

Reputation: 5106

Can't read from binary file - read some lines in UTF-8, some in binary

I have this code:

import java.io.*;
import java.nio.charset.StandardCharsets;

public class Main  {
    public static void main(String[] args) {
        zero("zero.out");
        System.out.println(zeroRead("zero.out"));
    }

    public static String zeroRead(String name)  {

        try (FileInputStream fos = new FileInputStream(name);
             BufferedInputStream bos = new BufferedInputStream(fos);
             DataInputStream dos = new DataInputStream(bos)) {

            StringBuffer inputLine = new StringBuffer();
            String tmp;
            String s = "";
            while ((tmp = dos.readLine()) != null) {
                inputLine.append(tmp);
                System.out.println(tmp);
            }
            dos.close();
            return s;
        }
        catch (IOException e)  {
            e.printStackTrace();
        }

        return null;
    }


    public static void zero(String name)  {
        File file = new File(name);
        String text = "König" + "\t";

        try (FileOutputStream fos = new FileOutputStream(file);
             BufferedOutputStream bos = new BufferedOutputStream(fos);
             DataOutputStream dos = new DataOutputStream(bos)) {

             dos.write(text.getBytes(StandardCharsets.UTF_8));
             dos.writeInt(50);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

The zero() method writes data to a file: the string is written in UTF-8, while the number is written in binary. zeroRead() reads the data back from the file.

The file looks like this after zero() is executed:

[image: contents of zero.out after zero() is executed]

This is what zeroRead() returns:

[image: output of zeroRead()]

How do I read the real data König\t50 from the file?

Upvotes: 0

Views: 90

Answers (1)

rzwitserloot

Reputation: 103473

DataInputStream's readLine method is deprecated, and its javadoc all but yells that it doesn't want to be used. You should heed this javadoc: the method is bad and you should not use it. It does not properly convert bytes to characters, so it does no charset decoding.

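Your file format is impossible as stated: you have no idea when to stop reading the string and start reading the binary number. However, the way you've described things, it sounds like the string is terminated by a newline, i.e. the \n character. (The code you posted actually writes a tab, \t, after the string; the same reasoning applies to whichever terminator you use.)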
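One way to remove the ambiguity, as an alternative to a terminator character, is to write the byte length of the string before the string itself. The following is only a sketch under that assumption: the class name LengthPrefixedRecord and its methods are hypothetical, and it changes the file layout from the one in the question. It works in memory so it can be run without touching zero.out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: layout is [int byteLength][UTF-8 bytes][int number].
class LengthPrefixedRecord {

    // Writes the record into a byte array instead of a file.
    static byte[] write(String text, int number) throws IOException {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(utf8.length); // tells the reader where the string ends
            out.write(utf8);
            out.writeInt(number);
        }
        return bytes.toByteArray();
    }

    // Reads the length prefix, then exactly that many bytes, and decodes them.
    static String readText(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] utf8 = new byte[length];
        in.readFully(utf8); // blocks until all bytes are read or throws EOFException
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = write("K\u00f6nig", 50); // "König"
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String text = readText(in);
            int number = in.readInt();
            System.out.println(text + "\t" + number);
        }
    }
}
```

With a length prefix the reader never has to scan for a terminator, so the string may contain tabs or newlines without breaking the format.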

There is no easy 'just make this filter-reader and call .nextLine on it' option available, as readers tend to buffer. You can try this:

InputStreamReader isr = new InputStreamReader(bos, StandardCharsets.UTF_8);

However, basic readers do not have a readLine method, and if you wrap this in a BufferedReader, it may read past the end (the 'buffer' in that name is not just there for kicks). You'd have to hand-roll a method that fetches one character at a time, appending each to a StringBuilder and stopping at a newline:

StringBuilder out = new StringBuilder();

for (int c = isr.read(); c != -1 && c != '\n'; c = isr.read())
  out.append((char) c);

String line = out.toString();

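This gets the line itself without asking for anything past the newline. One caveat: InputStreamReader's own documentation says it may read ahead more bytes from the underlying stream than the current read needs, so it can still gobble up your binary number. Check that the number is still readable from the stream afterwards.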
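A byte-level variant of the same loop avoids any reader read-ahead: it takes single bytes straight from the DataInputStream until the terminator, decodes the collected bytes as UTF-8, and then calls readInt on the same stream. This is a sketch, not the method above: the class name DelimitedRecord and its helpers are hypothetical, the terminator is the tab that the question's zero() writes, and the data is kept in memory rather than in zero.out. Scanning raw bytes for a tab is safe here because every byte of a multi-byte UTF-8 sequence is 0x80 or higher, so it can never equal 0x09.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: layout is [UTF-8 bytes][terminator byte][int number].
class DelimitedRecord {

    // Collects raw bytes up to (not including) the delimiter, then decodes them.
    static String readUntil(DataInputStream in, int delimiter) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != delimiter) {
            buf.write(b);
        }
        if (b == -1) {
            throw new EOFException("delimiter not found before end of stream");
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    // Mirrors the question's zero(): text, a tab, then a 4-byte int.
    static byte[] write(String text, int number) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.write((text + "\t").getBytes(StandardCharsets.UTF_8));
            out.writeInt(number);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = write("K\u00f6nig", 50); // "König"
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String text = readUntil(in, '\t');
            int number = in.readInt(); // the stream is positioned right after the tab
            System.out.println(text + "\t" + number);
        }
    }
}
```

To use this against a file, the DataInputStream can wrap a BufferedInputStream over a FileInputStream as in the question; that buffering is harmless because both the string bytes and the int are read through the same DataInputStream.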

Upvotes: 3
