sagar pawar

Reputation: 55

Chunking algorithm for any type of data

I want to split large files of any type (audio, video, image...) into small chunks. I have tried many algorithms but could not get any of them to work. Can anyone suggest a working algorithm?

Upvotes: 1

Views: 3937

Answers (2)

coderAJ

Reputation: 330

Reading a large file in one go is impractical, even when enough memory is available. Instead, for each split, read into a fixed-size byte array whose size you know is reasonable in terms of both performance and memory.

    import java.io.BufferedOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class FileSplitter {
        public static void main(String[] args) throws IOException {
            long numSplits = 10; // from user input, e.g. parsed from args
            int maxReadBufferSize = 8 * 1024; // 8 KB

            try (RandomAccessFile raf = new RandomAccessFile("test.csv", "r")) {
                long sourceSize = raf.length();
                long bytesPerSplit = sourceSize / numSplits;
                long remainingBytes = sourceSize % numSplits;

                for (int destIx = 1; destIx <= numSplits; destIx++) {
                    try (BufferedOutputStream bw = new BufferedOutputStream(
                            new FileOutputStream("split." + destIx))) {
                        // Copy the split in buffer-sized pieces, then the leftover piece.
                        long numReads = bytesPerSplit / maxReadBufferSize;
                        long numRemainingRead = bytesPerSplit % maxReadBufferSize;
                        for (long i = 0; i < numReads; i++) {
                            readWrite(raf, bw, maxReadBufferSize);
                        }
                        if (numRemainingRead > 0) {
                            readWrite(raf, bw, numRemainingRead);
                        }
                    }
                }
                // Bytes left over after the even division go into one extra file.
                if (remainingBytes > 0) {
                    try (BufferedOutputStream bw = new BufferedOutputStream(
                            new FileOutputStream("split." + (numSplits + 1)))) {
                        readWrite(raf, bw, remainingBytes);
                    }
                }
            }
        }

        // Copies exactly numBytes bytes from the current position of raf to bw.
        static void readWrite(RandomAccessFile raf, BufferedOutputStream bw, long numBytes) throws IOException {
            byte[] buf = new byte[(int) numBytes];
            // read() may return fewer bytes than requested; readFully() fills the whole buffer.
            raf.readFully(buf);
            bw.write(buf);
        }
    }

You may also find discussions on other sites useful, for example https://coderanch.com/t/458202/java/Approach-split-file-chunks. Happy coding.

Upvotes: 1

MBo

Reputation: 80232

Just copy consecutive chunks into small files, advancing the start position by ChunkSize each time:

 N = FileSize / ChunkSize  //integer division
 RestSize = FileSize % ChunkSize  //integer modulo
 for i = 0 to N - 1
     Copy ChunkSize bytes from position i * ChunkSize into ChunkFile[i]
 if RestSize > 0
     Copy RestSize bytes from position N * ChunkSize into ChunkFile[N]

Example: to divide a 7-byte file into 2-byte chunks, N = 3 and RestSize = 1, which gives three 2-byte files and one 1-byte file.
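The pseudocode above could be written in Java roughly as follows. This is only a sketch under a few assumptions: the class name `FileChunker`, the method `split`, the `chunk.<i>` output naming, and the 8 KB copy buffer are all illustrative choices, not part of the answer. It reads the source sequentially, so the position `i * ChunkSize` is implied by the order of the reads.

```java
import java.io.BufferedInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileChunker {

    /**
     * Splits source into files chunk.0, chunk.1, ... inside outDir, each holding
     * at most chunkSize bytes, and returns the number of chunk files written.
     * (Names and layout are hypothetical, chosen for this example.)
     */
    public static int split(Path source, Path outDir, long chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        Files.createDirectories(outDir);

        long fileSize = Files.size(source);
        long fullChunks = fileSize / chunkSize;   // N in the pseudocode
        long restSize = fileSize % chunkSize;     // RestSize in the pseudocode
        int total = (int) (fullChunks + (restSize > 0 ? 1 : 0));

        byte[] buf = new byte[8 * 1024];          // assumed copy buffer size
        try (InputStream in = new BufferedInputStream(Files.newInputStream(source))) {
            for (int i = 0; i < total; i++) {
                long size = (i < fullChunks) ? chunkSize : restSize;
                try (OutputStream out = Files.newOutputStream(outDir.resolve("chunk." + i))) {
                    copy(in, out, size, buf);
                }
            }
        }
        return total;
    }

    // Copies exactly count bytes from in to out, looping because read() may return fewer bytes.
    private static void copy(InputStream in, OutputStream out, long count, byte[] buf)
            throws IOException {
        long remaining = count;
        while (remaining > 0) {
            int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
            if (n == -1) {
                throw new EOFException("source ended before the expected size was read");
            }
            out.write(buf, 0, n);
            remaining -= n;
        }
    }
}
```

For the 7-byte example, `FileChunker.split(source, outDir, 2)` would write `chunk.0` through `chunk.3` and return 4. Since the data is treated as raw bytes, the same code applies to audio, video, or image files; concatenating the chunks in order restores the original.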

Upvotes: 1
