Reputation: 301
I have a mainframe data file that is larger than 4GB. I need to read and process the data in 500-byte records. I tried using a FileChannel, but I am getting an error with the message "Integer.Max_VALUE exceeded".
public void getFileContent(String fileName) {
    RandomAccessFile aFile = null;
    FileChannel inChannel = null;
    try {
        aFile = new RandomAccessFile(Paths.get(fileName).toFile(), "r");
        inChannel = aFile.getChannel();
        ByteBuffer buffer = ByteBuffer.allocate(500 * 100000);
        while (inChannel.read(buffer) > 0) {
            buffer.flip();
            for (int i = 0; i < buffer.limit(); i++) {
                byte[] data = new byte[500];
                buffer.get(data);
                processData(new String(data));
                buffer.clear();
            }
        }
    } catch (Exception ex) {
        // TODO
    } finally {
        try {
            inChannel.close();
            aFile.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Can you help me out with a solution?
Upvotes: 1
Views: 1419
Reputation: 298203
The worst problem of your code is the

catch (Exception ex) {
    // TODO
}

part, which implies that you won’t notice any exceptions thrown by your code. Since there is nothing in the JRE printing an “Integer.Max_VALUE exceeded” message, that problem must be connected to your processData method.
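To make failures visible instead of swallowing them, one option (a minimal sketch; the readOrFail helper and the "disk error" message are made up for illustration) is to wrap the checked exception and rethrow it:

```java
import java.io.IOException;
import java.io.UncheckedIOException;

public class FailFast {
    // Instead of an empty catch block, wrap the checked exception and
    // rethrow it, so the caller actually sees what went wrong.
    static String readOrFail(boolean simulateFailure) {
        try {
            if (simulateFailure) {
                throw new IOException("disk error");
            }
            return "data";
        } catch (IOException ex) {
            throw new UncheckedIOException("read failed", ex);
        }
    }
}
```

This way a failure surfaces at the call site with its original cause attached, rather than silently producing no output.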
It might be worth noting that this method will be invoked way too often with repeated data.
Your loop

for (int i = 0; i < buffer.limit(); i++) {

implies that you iterate as many times as there are bytes within the buffer, up to 500 * 100000 times. You are extracting 500 bytes from the buffer in each iteration, processing a total of up to 500 * 500 * 100000 bytes per read, but since you have a misplaced buffer.clear(); at the end of the loop body, you will never experience a BufferUnderflowException. Instead, you will invoke processData up to 500 * 100000 times, each time with the first 500 bytes of the buffer.
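For comparison, a corrected byte-based loop would consume whole records while remaining() allows it and then compact() the buffer, rather than clearing it inside the loop (a minimal sketch; the 4-byte record size and the drain helper are made up here to keep the example small — the real code would use 500):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class RecordLoop {
    static final int RECORD = 4; // 500 in the real code; small here for illustration

    // Consume complete records from the buffer; leave any partial tail in place.
    static int drain(ByteBuffer buffer, StringBuilder out) {
        buffer.flip();
        int records = 0;
        while (buffer.remaining() >= RECORD) {
            byte[] data = new byte[RECORD];
            buffer.get(data);                       // advances the position by RECORD
            out.append(new String(data, StandardCharsets.US_ASCII)).append('|');
            records++;
        }
        buffer.compact();                           // keep the incomplete tail for the next read
        return records;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.put("abcdefghij".getBytes(StandardCharsets.US_ASCII)); // 2 full records + 2 tail bytes
        StringBuilder out = new StringBuilder();
        int n = drain(buf, out);
        System.out.println(n + " records: " + out);          // prints "2 records: abcd|efgh|"
        System.out.println("tail bytes kept: " + buf.position()); // prints "tail bytes kept: 2"
    }
}
```

Each get(data) advances the position, so the loop walks through the buffer instead of re-reading the first record, and compact() preserves a partial record across reads.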
But the whole conversion from bytes to a String is unnecessarily verbose and contains unnecessary copy operations. Instead of implementing this yourself, you can and should just use a Reader.
Besides that, your code makes a strange detour. It starts with a Java 7 API, Paths.get, converts the result to a legacy File object, and creates a legacy RandomAccessFile just to eventually acquire a FileChannel. If you have a Path and want a FileChannel, you should open it directly via FileChannel.open. And, of course, use a try(…) { … } statement to ensure proper closing.
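Opening the channel directly could look like this (a sketch; countBytes is a hypothetical helper used only to demonstrate FileChannel.open combined with try-with-resources):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class DirectChannel {
    // Open the FileChannel straight from the Path; the try-with-resources
    // statement closes the channel even if reading throws.
    static long countBytes(String fileName) throws IOException {
        try (FileChannel channel = FileChannel.open(Paths.get(fileName), StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(8192);
            long total = 0;
            for (int n; (n = channel.read(buffer)) > 0; ) {
                total += n;
                buffer.clear();   // reuse the buffer for the next read
            }
            return total;
        }
    }
}
```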
But, as said, if you want to process the contents as Strings, you surely want to use a Reader instead:
public void getFileContent(String fileName) {
    try(Reader reader = Files.newBufferedReader(Paths.get(fileName))) {
        CharBuffer buffer = CharBuffer.allocate(500 * 100000);
        while(reader.read(buffer) > 0) {
            buffer.flip();
            while(buffer.remaining() > 500) {
                processData(buffer.slice().limit(500).toString());
                buffer.position(buffer.position() + 500);
            }
            buffer.compact();
        }
        // there might be a remaining chunk of less than 500 characters
        if(buffer.position() > 0) {
            processData(buffer.flip().toString());
        }
    } catch(Exception ex) {
        // the *minimum* to do:
        ex.printStackTrace();
        // TODO real exception handling
    }
}
There is no problem with processing files >4GB; I just tested it with an 8GB file. Note that the code above uses the UTF-8 encoding. If you want to retain the behavior of your original code of using whatever happens to be your system’s default encoding, you may create the Reader using

Files.newBufferedReader(Paths.get(fileName), Charset.defaultCharset())

instead.
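As a small illustration of that charset overload (a sketch; readAll is a made-up helper, not part of the answer's method), reading back bytes that were written in a known encoding:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class CharsetChoice {
    // Read the whole file with an explicit charset, using the same
    // Files.newBufferedReader overload mentioned above.
    static String readAll(Path path, Charset cs) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = Files.newBufferedReader(path, cs)) {
            for (int c; (c = reader.read()) >= 0; ) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }
}
```

Passing the charset that matches how the bytes were produced is what makes non-ASCII characters round-trip correctly.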
Upvotes: 3