Reputation: 1795
I have a method that accepts a file and a chunk size and returns a list of chunk files. The main problem is that a line in the file can be broken across two chunks. For example, suppose the main file contains the following lines:
|1|aaa|bbb|ccc|
|2|ggg|ddd|eee|
After the split, one file could contain:
|1|aaa|bbb
And another file:
|ccc|2|
|ggg|ddd|eee|
Here is the code:
public static List<File> splitFile(File file, int sizeOfFileInMB) throws IOException {
    int counter = 1;
    List<File> files = new ArrayList<>();
    int sizeOfChunk = 1024 * 1024 * sizeOfFileInMB;
    byte[] buffer = new byte[sizeOfChunk];
    try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(file))) {
        String name = file.getName();
        int tmp = 0;
        while ((tmp = bis.read(buffer)) > 0) {
            File newFile = new File(file.getParent(), name + "."
                    + String.format("%03d", counter++));
            try (FileOutputStream out = new FileOutputStream(newFile)) {
                out.write(buffer, 0, tmp);
            }
            files.add(newFile);
        }
    }
    return files;
}
Should I use the RandomAccessFile class for this purpose (the main file is really big, more than 5 GB)?
Upvotes: 10
Views: 24072
Reputation: 117
Split the file into chunks according to your chunk size:
// Loads the whole file into memory, so this only suits files that fit in RAM
val data = FileInputStream(file).use { it.readBytes() }
var start = 0
val max = data.size
while (start < max) {
    // Clamp the end so the final chunk stops exactly at the end of the data
    val end = minOf(start + CHUNK_SIZE, max)
    val subData = data.copyOfRange(start, end)
    start = end
    // Function to upload your chunk; isLast is true only for the final chunk
    uploadFileInChunk(subData, isLast = start == max)
}
If you receive the file from the user through an intent, you may get a content URI instead of a file path. In that case, read the bytes through the content resolver:
Uri uri = data.getData();
InputStream inputStream = getContext().getContentResolver().openInputStream(uri);
fileInBytes = IOUtils.toByteArray(inputStream);
Add the dependency to your build.gradle to use IOUtils:
implementation 'commons-io:commons-io:2.11.0'
Now modify the code above slightly to send the file to the server:
var start = 0
val max = fileInBytes.size
while (start < max) {
    // Clamp the end so the final chunk stops exactly at the end of the data
    val end = minOf(start + CHUNK_SIZE, max)
    val subData = fileInBytes.copyOfRange(start, end)
    start = end
    // isLast is true only for the final chunk
    uploadFileInChunk(subData, isLast = start == max)
}
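For large files, holding every byte in one array may not fit in memory. The following is a minimal Java sketch of the same chunk-and-upload idea that streams the input instead. `ChunkedReader`, `ChunkHandler`, and `forEachChunk` are hypothetical names introduced here for illustration; the handler stands in for `uploadFileInChunk`. It reads one chunk ahead so that it can tell the handler whether the current chunk is the last one.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class ChunkedReader {
    /** Hypothetical callback standing in for uploadFileInChunk. */
    interface ChunkHandler {
        void handle(byte[] chunk, boolean isLast) throws IOException;
    }

    /** Fills buf as far as possible; returns the bytes read (0 at end of stream). */
    private static int fill(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }

    /** Passes the stream to the handler in chunks of at most chunkSize bytes. */
    static void forEachChunk(InputStream in, int chunkSize, ChunkHandler handler)
            throws IOException {
        byte[] current = new byte[chunkSize];
        byte[] next = new byte[chunkSize];
        int currentLen = fill(in, current);
        while (currentLen > 0) {
            // Read ahead: if nothing follows, the current chunk is the last one
            int nextLen = fill(in, next);
            handler.handle(Arrays.copyOf(current, currentLen), nextLen == 0);
            byte[] tmp = current;
            current = next;
            next = tmp;
            currentLen = nextLen;
        }
    }
}
```

Only two buffers of `chunkSize` bytes are alive at any time, so memory use does not grow with the file size. An empty stream produces no handler calls.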
Upvotes: 0
Reputation: 21
Just in case anyone is interested in a Kotlin version. It creates an iterator of ByteArray chunks:
class ByteArrayReader(val input: InputStream, val chunkSize: Int, val bufferSize: Int = 1024*8): Iterator<ByteArray> {
    var eof: Boolean = false
    init {
        if ((chunkSize % bufferSize) != 0) {
            throw RuntimeException("ChunkSize(${chunkSize}) should be a multiple of bufferSize (${bufferSize})")
        }
    }
    // Note: if the stream length is an exact multiple of chunkSize, the last chunk returned is empty
    override fun hasNext(): Boolean = !eof
    override fun next(): ByteArray {
        if (eof) throw NoSuchElementException()
        val buffer = ByteArray(bufferSize)
        val chunkWriter = ByteArrayOutputStream(chunkSize) // no need to close - implementation is empty
        var offset = 0
        while (offset < chunkSize) {
            // read() may return fewer bytes than requested, so never ask for more than the chunk still needs
            val bytesRead = input.read(buffer, 0, minOf(bufferSize, chunkSize - offset))
            if (bytesRead < 0) {
                eof = true
                break
            }
            chunkWriter.write(buffer, 0, bytesRead)
            offset += bytesRead
        }
        return chunkWriter.toByteArray()
    }
}
Upvotes: 2
Reputation: 97
Split a file into multiple chunks in memory. Here I split a file into chunks of 500 KB (500,000 bytes) and add them to a list:
public static List<ByteArrayOutputStream> splitFile(File f) {
    List<ByteArrayOutputStream> datalist = new ArrayList<>();
    int sizeOfFiles = 500000; // chunk size in bytes
    byte[] buffer = new byte[sizeOfFiles];
    try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(f))) {
        int bytesAmount;
        while ((bytesAmount = bis.read(buffer)) > 0) {
            // ByteArrayOutputStream keeps its data in memory, so it needs no close or flush
            ByteArrayOutputStream out = new ByteArrayOutputStream(bytesAmount);
            out.write(buffer, 0, bytesAmount);
            datalist.add(out);
        }
    } catch (IOException e) {
        // Report the failure instead of silently returning a partial list
        throw new UncheckedIOException(e);
    }
    return datalist;
}
Upvotes: 0
Reputation: 1649
If you don't mind having chunks of different lengths (each <= sizeOfChunk, but as close to it as possible), then here is the code:
public static List<File> splitFile(File file, int sizeOfFileInMB) throws IOException {
    int counter = 1;
    List<File> files = new ArrayList<File>();
    int sizeOfChunk = 1024 * 1024 * sizeOfFileInMB;
    // Appended to every line, because readLine() strips the original line terminator
    String eol = System.lineSeparator();
    try (BufferedReader br = new BufferedReader(new FileReader(file))) {
        String name = file.getName();
        String line = br.readLine();
        while (line != null) {
            File newFile = new File(file.getParent(), name + "."
                    + String.format("%03d", counter++));
            try (OutputStream out = new BufferedOutputStream(new FileOutputStream(newFile))) {
                int fileSize = 0;
                while (line != null) {
                    byte[] bytes = (line + eol).getBytes(Charset.defaultCharset());
                    // Start a new chunk when this line would not fit. A single line longer than
                    // sizeOfChunk is still written on its own, so the loop always makes progress.
                    if (fileSize > 0 && fileSize + bytes.length > sizeOfChunk)
                        break;
                    out.write(bytes);
                    fileSize += bytes.length;
                    line = br.readLine();
                }
            }
            files.add(newFile);
        }
    }
    return files;
}
The only problem here is the file charset, which in this example is the system default charset. If you want to be able to change it, let me know and I'll add a third parameter to the "splitFile" function for it.
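A minimal sketch of what such a charset-aware variant could look like. This is not the answerer's code: the class name `LineSplitter`, the `charset` parameter, and taking the chunk size in bytes as a `long` are assumptions made for illustration. It also assumes a charset such as UTF-8 or ISO-8859-1, where encoding each line separately yields the same bytes as encoding the whole text.

```java
import java.io.BufferedOutputStream;
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

final class LineSplitter {
    /** Splits file into parts of at most chunkBytes bytes, breaking only at line ends. */
    static List<File> splitFile(File file, long chunkBytes, Charset charset)
            throws IOException {
        List<File> files = new ArrayList<>();
        // readLine() strips the terminator, so each line gets the system separator back
        byte[] eol = System.lineSeparator().getBytes(charset);
        int counter = 1;
        try (BufferedReader br = Files.newBufferedReader(file.toPath(), charset)) {
            String line = br.readLine();
            while (line != null) {
                File part = new File(file.getParent(),
                        file.getName() + "." + String.format("%03d", counter++));
                try (OutputStream out =
                        new BufferedOutputStream(Files.newOutputStream(part.toPath()))) {
                    long written = 0;
                    while (line != null) {
                        byte[] bytes = line.getBytes(charset);
                        long next = written + bytes.length + eol.length;
                        // A line longer than chunkBytes still gets a part of its own
                        if (written > 0 && next > chunkBytes) {
                            break;
                        }
                        out.write(bytes);
                        out.write(eol);
                        written = next;
                        line = br.readLine();
                    }
                }
                files.add(part);
            }
        }
        return files;
    }
}
```

A call might look like `LineSplitter.splitFile(file, 1024L * 1024 * 100, StandardCharsets.UTF_8)`. Using a `long` for the size avoids the `int` overflow that `1024 * 1024 * sizeOfFileInMB` hits at 2048 MB and above. Note that `Files.newBufferedReader` throws on bytes that are invalid in the given charset, whereas `FileReader` silently substitutes them.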
Upvotes: 13