Joshua Clayton

Reputation: 1729

git-svn fetch fails on file whose size > LONG_MAX

I am trying to use git-svn to migrate from Subversion.

Right now I am blocked because

$ git svn fetch

fails on line 900 of Git.pm (from the git-svn package):

...
    my $read = read($in, $blob, $bytesToReadd);

This is in the sub cat_blob(). The problem is that the file is 2567089913 bytes, and when git-svn gets to 2147484672 bytes it chokes with the message "Offset outside of string". cat_blob tries to hold the entire file in a variable before writing it to disk.

I tried moving the write from the end of the sub into the read loop. Here is what my modified code looks like:

890         my $size = $1;
891 
892         my $blob;
893         my $bytesRead = 0;
894 
895         while (1) {
896                 my $bytesLeft = $size - $bytesRead;
897                 last unless $bytesLeft;
898 
899                 my $bytesToRead = $bytesLeft < 1024 ? $bytesLeft : 1024;
900                 print $size, " ", $bytesLeft, " ", $bytesRead, "\n";
901                 my $read = read($in, $blob, $bytesToReadd);
902                 unless (defined($read)) {
903                         $self->_close_cat_blob();
904                         throw Error::Simple("in pipe went bad");
905                 unless (print $fh $blob) {
906                         $self->_close_cat_blob();
907                         throw Error::Simple("couldn't write to passed in filehandle");
908         }
909 
910                 }
911 
912                 $bytesRead += $read;
913         }

but now I get a new error:

Checksum mismatch: root/Instruments/MY_DIR/MASSIVE_FILE.exe bca43a9cb6c3b7fdb76c460781eb410a34b6b9ec
expected: 52daf59b450b82a541e782dbfb803a32
     got: d41d8cd98f00b204e9800998ecf8427e

I'm not a Perl guy. Does Perl put extra crap into the output of the print statement there? Any ideas how I can get the checksum to match?

Upvotes: 2

Views: 293

Answers (1)

ikegami

Reputation: 386361

The error becomes apparent when you fix the indenting.

890         my $size = $1;
891 
892         my $blob;
893         my $bytesRead = 0;
894 
895         while (1) {
896                 my $bytesLeft = $size - $bytesRead;
897                 last unless $bytesLeft;
898 
899                 my $bytesToRead = $bytesLeft < 1024 ? $bytesLeft : 1024;
900                 print $size, " ", $bytesLeft, " ", $bytesRead, "\n";
901                 my $read = read($in, $blob, $bytesToReadd);
902      --->       unless (defined($read)) {
903                     $self->_close_cat_blob();
904                     throw Error::Simple("in pipe went bad");
905      --->           unless (print $fh $blob) {
906                         $self->_close_cat_blob();
907                         throw Error::Simple("couldn't write to passed in filehandle");
908                     }
909 
910                 }
911 
912                 $bytesRead += $read;
913         }

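The print is never reached: it sits inside the unless (defined($read)) block, after the throw, so nothing is ever written to the file. That matches the error message, since the "got" checksum d41d8cd98f00b204e9800998ecf8427e is the MD5 of empty input. Just move lines 905-909 out of that block, to just before line 912.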

Oh, and you misspelled $bytesToRead as $bytesToReadd in line 901. Didn't the compiler pick that up?
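As a rough illustration of why the compiler would normally catch that typo: assuming "use strict" is in effect for the file being edited, an undeclared variable such as $bytesToReadd is a compile-time error. The snippet below is a standalone sketch, not code from Git.pm; it uses a string eval only so the error can be captured and printed.

```perl
use strict;
use warnings;

# Compile a snippet containing the misspelled name under "use strict".
# String eval is used only so this script can capture the compile error.
my $snippet = <<'END_SNIPPET';
use strict;
my $bytesToRead = 1024;
my $n = $bytesToReadd;    # typo: this variable was never declared
1;
END_SNIPPET

my $ok    = eval $snippet;   # undef if the snippet failed to compile
my $error = $@;

print defined $ok ? "compiled cleanly\n" : "compile error: $error";
```

If the typo compiles without complaint in your copy, it is worth checking whether strict is actually enabled there.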

You should use a block size larger than 1024. 64*1024 would be much faster.
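Putting the points above together, here is a minimal sketch of the read/write loop with the print moved out of the error branch, the variable name spelled consistently, and a 64*1024 block size. The helper name copy_exact_bytes and the in-memory filehandles are assumptions made for illustration; they are not part of Git.pm, which uses its own filehandles and throws Error::Simple instead of calling die.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Git.pm): copy exactly $size bytes from
# $in to $out in fixed-size chunks, so the whole blob is never held in memory.
sub copy_exact_bytes {
    my ($in, $out, $size, $block_size) = @_;
    $block_size ||= 64 * 1024;

    my $bytes_read = 0;
    while ($bytes_read < $size) {
        my $bytes_left    = $size - $bytes_read;
        my $bytes_to_read = $bytes_left < $block_size ? $bytes_left : $block_size;

        my $chunk;
        my $read = read($in, $chunk, $bytes_to_read);
        die "read failed: $!\n" unless defined $read;
        die "unexpected end of input after $bytes_read bytes\n" if $read == 0;

        # Write the chunk on every iteration, outside the error branch.
        print {$out} $chunk or die "write failed: $!\n";

        $bytes_read += $read;
    }
    return $bytes_read;
}

# Usage with in-memory filehandles standing in for the pipe and the output file.
my $data = join '', map { chr($_ % 256) } 0 .. 199_999;
open my $in,  '<', \$data or die "cannot open in-memory input: $!";
my $copy = '';
open my $out, '>', \$copy or die "cannot open in-memory output: $!";
binmode $_ for $in, $out;

my $copied = copy_exact_bytes($in, $out, length $data, 64 * 1024);
close $out;
```

Because each chunk is written as soon as it is read, memory use stays at roughly one block regardless of the file size.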

Upvotes: 3
