orkutWasNotSoBad
orkutWasNotSoBad

Reputation: 648

Why does bsdiff.exe have trouble with this smaller file?

I'm building a software patch using bsdiff.exe and applying it with bspatch.exe and have so far had no trouble with files smaller than 120MB. One binary file I have was previously 21MB and is now 77MB, and bsdiff seems to hang indefinitely on it.

According to the documentation, "bsdiff is quite memory-hungry. It requires max(17*n,9*n+m)+O(1) bytes of memory, where n is the size of the old file and m is the size of the new file." This explains the problem with large files, but the issue seems to occur when the delta is larger.

Does anyone have any information regarding this? Anything would be helpful, thanks!

Upvotes: 2

Views: 2588

Answers (2)

Eric Leschinski
Eric Leschinski

Reputation: 153842

Try one of the other binary diffing programs listed here:

https://stackoverflow.com/questions/688504/binary-diff-tool-for-very-large-files

The differences between the two files require memory above and beyond the memory required to represent both files. So processing two binary files with many differences will require more memory than two identical files.

It has trouble with the smaller file because there is a bug in the software. Colin Percival, the guy who wrote it has acknowledged the bug and said he doesn't have time to fix it.

https://www.daemonology.net/bsdiff/

Upvotes: 1

Graeme Johnson
Graeme Johnson

Reputation: 51

I also had a problem with bsdiff crashing when trying to process a file containing just 2MB of DSP executable code.

After some debugging I determined that the issue lies within the qsufsort function which is used to create a suffix array based on the "old" file. qsufsort calls a function called split which calls itself recursively. In the crash case the recursive call happens so many times that the program runs out of stack space and throws an exception.

As suggested by this thread: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=409664 The solution is to replace qsufsort with a different solution to produce a suffix array. The Wikipedia entry for suffix arrays references SA-IS and so I downloaded the source from here: https://sites.google.com/site/yuta256/sais

I then rebuilt bsdiff.c together with sais.c and sais.h and replaced the call to qsufsort with :

I[0] = oldsize; sais(old, I+1, oldsize);

Now bsdiff works every time and it's quicker too!

Upvotes: 5

Related Questions