Reputation: 13
I have a binary file on the Hadoop distributed file system that I want to read. I am using FSDataInputStream (which extends DataInputStream). I have a buffer of length "len", and I call readBytes = stream.read(buffer) to read "len" bytes from the file into the buffer. BUT the actual number of bytes read (readBytes) is less than the buffer size (len), even though I know the file contains "len" bytes. So why does FSDataInputStream read fewer bytes than I ask it to? Any IDEA?
Upvotes: 1
Views: 2510
Reputation: 125
If you are positioned near the end of a block of the file, such that "len" bytes forward from that position falls somewhere in the next block, then stream.read(buffer) will return only the bytes remaining in the current block. On the subsequent read you will start getting the bytes from the next block of the file.
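This behavior can be simulated without a cluster. The hypothetical `BlockBoundedStream` below caps every read at 64 bytes, standing in for a read that stops at a block boundary; the class and the 64-byte "block size" are illustrative, not part of the Hadoop API:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stream that never returns more than 64 bytes per call,
// simulating a read() that stops at a block boundary.
class BlockBoundedStream extends ByteArrayInputStream {
    BlockBoundedStream(byte[] buf) { super(buf); }

    @Override
    public int read(byte[] b, int off, int len) {
        // Serve at most 64 bytes, even if the caller asked for more.
        return super.read(b, off, Math.min(len, 64));
    }
}

public class ShortReadDemo {
    public static void main(String[] args) throws IOException {
        InputStream in = new BlockBoundedStream(new byte[100]);
        byte[] buf = new byte[100];

        // First read stops at the simulated boundary.
        int first = in.read(buf);                              // 64, not 100
        // Second read picks up the remaining bytes.
        int second = in.read(buf, first, buf.length - first);  // 36
        System.out.println(first + " " + second);              // prints "64 36"
    }
}
```

Even though 100 bytes are available, the first read returns only 64 and the rest arrive on the next call, which is exactly what the contract of read(byte[]) allows.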
Upvotes: 0
Reputation: 139931
The JavaDocs for DataInputStream.read(byte[]) and InputStream.read(byte[]) state pretty clearly that the method will read "some number of bytes", up to the length of the byte array. There are several reasons why the call might return before the byte array is filled.
You shouldn't call the read(byte[]) method just once to consume bytes from a stream; you need to loop and continue reading from the stream until it returns -1.
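That loop can be sketched as follows. This is a generic helper over plain java.io streams (the method name `readFully` here is just illustrative), but since FSDataInputStream extends DataInputStream, the same pattern applies to it:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadLoop {
    // Keep calling read() until the buffer is full or the stream ends.
    // Returns the total number of bytes actually read.
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n == -1) {
                break; // end of stream reached before the buffer filled
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Any InputStream works; a FSDataInputStream could be passed the same way.
        InputStream in = new ByteArrayInputStream(new byte[100]);
        byte[] buf = new byte[100];
        System.out.println(readFully(in, buf)); // prints "100"
    }
}
```

Note that DataInputStream already provides readFully(byte[]), which does this loop for you and throws EOFException if the stream ends early; the explicit loop above is useful when a short final read is acceptable.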
Upvotes: 5