user2531191

Reputation: 579

Why do we use byte to read binary data?

We read and write binary files using the Java primitive 'byte', as in FileInputStream.read(byte[]) etc. In other examples we see byte[] b = String.getBytes(). A byte is just an 8-bit value. Why do we use byte[] to read binaries? What does a byte value contain after reading from a file or a String?

Upvotes: 0

Views: 606

Answers (3)

Stephen C

Reputation: 719551

We read and write binary files using the Java primitive 'byte', as in FileInputStream.read(byte[]) etc.

Because the operating system models files as sequences of bytes (or more precisely, as octets). The byte type is the most natural representation of an octet in Java.

Why do we use byte[] to read binaries?

Same answer as before. In reality, though, you can also read binary files in other ways, e.g. using DataInputStream.
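To make the two approaches concrete, here is a minimal sketch, not a definitive implementation. The class name BinaryReadSketch and the temporary file are made up for illustration. It reads the same four bytes once as a raw byte[] and once as a single int through DataInputStream. Note that Java's byte is signed, so the octet 0xFF comes back as -1 unless it is masked with & 0xFF.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the name is an assumption, not part of any API.
class BinaryReadSketch {

    // Returns the file's contents exactly as stored: one byte per octet.
    static byte[] readRaw(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    // Reads the first four bytes of the file as one big-endian int.
    static int readFirstInt(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            return in.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sketch", ".bin");
        Files.write(file, new byte[] {0x00, 0x00, 0x01, (byte) 0xFF});

        byte[] raw = readRaw(file);
        System.out.println(raw[3]);             // -1, because byte is signed
        System.out.println(raw[3] & 0xFF);      // 255, the unsigned octet value
        System.out.println(readFirstInt(file)); // 511, i.e. 0x000001FF

        Files.delete(file);
    }
}
```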

What does a byte value contain after reading from a file or a String?

In the first case, the byte that was in the file.

In the second case, you don't "read" bytes from a String. Rather, when you call String.getBytes() you get the bytes that make up the String's characters when they are encoded in a particular character set. If you use the no-args getBytes() method, the characters are encoded using the JVM's default character set. You can also supply an argument to choose a different encoding.
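As a small illustration of how the chosen encoding changes the result, the following sketch encodes the one-character string "é" (U+00E9) with three standard charsets. The class and method names are hypothetical; only String.getBytes(Charset) and StandardCharsets come from the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are assumptions for illustration.
class GetBytesSketch {

    // Encodes the string with the given charset and returns the raw bytes.
    static byte[] encode(String s, Charset charset) {
        return s.getBytes(charset);
    }

    public static void main(String[] args) {
        String s = "\u00e9"; // the single character 'é'

        System.out.println(encode(s, StandardCharsets.ISO_8859_1).length); // 1
        System.out.println(encode(s, StandardCharsets.UTF_8).length);      // 2
        System.out.println(encode(s, StandardCharsets.UTF_16BE).length);   // 2

        // The no-args form uses the JVM's default charset, so its result
        // can differ between machines and Java versions.
        System.out.println(s.getBytes().length);
    }
}
```

Passing the charset explicitly keeps the output independent of the platform default.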


Java makes a clear distinction between bytes (8-bit quantities) and characters. Conceptually, Java characters are Unicode code points, and strings and similar representations of text are sequences of characters ... not sequences of bytes.

(Unfortunately, there is a "wrinkle" in the implementation. When Java was designed, the Unicode character space fitted into 16 bits; i.e. there were <= 65536 recognized code points. Java was designed to match this ... and the char type was defined as a 16-bit unsigned integral type. But then Unicode was expanded to > 65536 code points, and Java was left with the awkward problem that some Unicode code points could not be represented by a single char value. Instead, they are represented by a pair of char values ... a so-called surrogate pair ... and Java strings are effectively represented in UTF-16. For most common characters / character sets, this doesn't matter. But if you need to deal with unusual characters / character sets, the correct way to deal with Strings is to use the "codepoint" methods.)
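The surrogate-pair wrinkle can be seen in a short sketch. The class name and the sample string (the letter 'a' followed by the code point U+1F600) are chosen here purely for illustration; length(), codePointCount() and codePoints() are the standard String methods.

```java
// Hypothetical demo class; the name is an assumption for illustration.
class CodePointSketch {

    // Number of 16-bit char values (UTF-16 code units) in the string.
    static int charCount(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // 'a' followed by one code point above 0xFFFF, written as a surrogate pair.
        String s = "a\uD83D\uDE00";

        System.out.println(charCount(s));      // 3
        System.out.println(codePointCount(s)); // 2

        // Iterating by code point yields 0x61 and 0x1f600, not three chars.
        s.codePoints().forEach(cp -> System.out.println(Integer.toHexString(cp)));
    }
}
```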

Upvotes: 5

A String is ultimately built from bytes. Bytes are built from bits. The bits are what is "physically" stored on the drive.

So instead of reading data from the drive bit by bit, it is read in larger units, which are bytes.

So the byte[] contains raw data. The raw data is identical to what is stored on the drive.

You always read raw data first; then you can apply a decoder that turns those bytes into characters, and eventually into letters displayed on the screen, if it is a text file. If you deal with an image, you will read bytes that store information about color instead of characters.
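A rough sketch of that idea: the same kind of raw byte[] can be decoded as text or treated as numeric values such as color components. The class, the method names and the sample bytes are invented for illustration, and the text is assumed to be UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are assumptions for illustration.
class RawBytesSketch {

    // Interprets the raw bytes as UTF-8 encoded text.
    static String asText(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Interprets each raw byte as an unsigned value in the range 0..255,
    // e.g. one color component of a pixel.
    static int[] asUnsigned(byte[] raw) {
        int[] values = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            values[i] = raw[i] & 0xFF;
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] textBytes = {72, 105};               // bytes of a text file
        System.out.println(asText(textBytes));      // Hi

        byte[] pixel = {(byte) 0xFF, 0x00, 0x00};   // one red RGB pixel
        int[] rgb = asUnsigned(pixel);
        System.out.println(rgb[0] + "," + rgb[1] + "," + rgb[2]); // 255,0,0
    }
}
```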

Upvotes: 2

immyth

Reputation: 43

Because the byte is the smallest addressable unit of storage.

Upvotes: -1
