Blue Bot

Reputation: 2438

Getting `write too long` error when trying to create tar.gz file from files and directories

I'm trying to create a tar.gz file from multiple directories and files, with the same usage as:

    tar -cvzf sometarfile.tar.gz somedir/ someotherdir/ somefile.json somefile.xml

Assume the directories have other directories nested inside them. I have this as input:

    paths := []string{
        "somedir/",
        "someotherdir/",
        "somefile.json",
        "somefile.xml",
    }

and using these:

    func TarFilesDirs(paths []string, tarFilePath string) error {
        // set up the output file
        file, err := os.Create(tarFilePath)
        if err != nil {
            return err
        }
        defer file.Close()

        // set up the gzip writer
        gz := gzip.NewWriter(file)
        defer gz.Close()

        tw := tar.NewWriter(gz)
        defer tw.Close()

        // add each file/dir as needed into the current tar archive
        for _, p := range paths {
            if err := tarit(p, tw); err != nil {
                return err
            }
        }

        return nil
    }

    func tarit(source string, tw *tar.Writer) error {
        info, err := os.Stat(source)
        if err != nil {
            return err
        }

        var baseDir string
        if info.IsDir() {
            baseDir = filepath.Base(source)
        }

        return filepath.Walk(source,
            func(path string, info os.FileInfo, err error) error {
                if err != nil {
                    return err
                }

                header, err := tar.FileInfoHeader(info, info.Name())
                if err != nil {
                    return err
                }

                if baseDir != "" {
                    header.Name = filepath.Join(baseDir, strings.TrimPrefix(path, source))
                }

                if err := tw.WriteHeader(header); err != nil {
                    return err
                }

                if info.IsDir() {
                    return nil
                }

                file, err := os.Open(path)
                if err != nil {
                    return err
                }
                defer file.Close()

                _, err = io.Copy(tw, file)
                if err != nil {
                    log.Println("failing here")
                    return err
                }

                return nil
            })
    }

Problem: if a directory is large, I get this error:

    archive/tar: write too long

When I remove that directory, everything works.

I've run out of ideas and wasted many hours trying to find a solution.

Any ideas?

Thanks

Upvotes: 11

Views: 12919

Answers (3)

Rohit Durvasula

Reputation: 46

Since you are only seeing this issue with a large directory, the following fix might not help in your case, but it does address the problem of creating a tar from files that are continuously growing.

In my case, the issue was that when we created the tar header, header.Size (inside tar.FileInfoHeader) was set to the file size (info.Size()) at that instant.

When we later open the file (os.Open) and copy its contents (io.Copy), we risk copying more data than the size we set in the tar header, because the file could have grown in the meantime.

This ensures we copy only as much data as the tar header size declares:

    _, err = io.CopyN(tw, file, info.Size())
    if err != nil {
        log.Println("failing here")
        return err
    }

Upvotes: 2

user7009112

Reputation:

I was having a similar issue until I looked more closely at the tar.FileInfoHeader doc:

FileInfoHeader creates a partially-populated Header from fi. If fi describes a symlink, FileInfoHeader records link as the link target. If fi describes a directory, a slash is appended to the name. Because os.FileInfo's Name method returns only the base name of the file it describes, it may be necessary to modify the Name field of the returned header to provide the full path name of the file.

Essentially, FileInfoHeader isn't guaranteed to fill out all the header fields before you write it with WriteHeader; if you look at the implementation, the Size field is only set for regular files. Your code snippet only special-cases directories, which means that if you come across any other non-regular file, you write a header with a size of zero and then attempt to copy a potentially non-zero-sized special file on disk into the tar. Go returns ErrWriteTooLong to stop you from creating a broken tar.

I came up with this and haven't had the issue since.

    if err := filepath.Walk(directory, func(path string, info os.FileInfo, err error) error {
        if err != nil {
            return check(err)
        }

        var link string
        if info.Mode()&os.ModeSymlink == os.ModeSymlink {
            if link, err = os.Readlink(path); err != nil {
                return check(err)
            }
        }

        header, err := tar.FileInfoHeader(info, link)
        if err != nil {
            return check(err)
        }

        header.Name = filepath.Join(baseDir, strings.TrimPrefix(path, directory))
        if err = tw.WriteHeader(header); err != nil {
            return check(err)
        }

        if !info.Mode().IsRegular() { //nothing more to do for non-regular
            return nil
        }

        fh, err := os.Open(path)
        if err != nil {
            return check(err)
        }
        defer fh.Close()

        if _, err = io.CopyBuffer(tw, fh, buf); err != nil {
            return check(err)
        }
        return nil
    }); err != nil {
        return check(err)
    }

Upvotes: 17

TehSphinX

Reputation: 7440

Write writes to the current entry in the tar archive. Write returns the error ErrWriteTooLong if more than hdr.Size bytes are written after WriteHeader.

There is a Size option you can add to the Header. Haven't tried it but maybe that helps...

See also https://golang.org/pkg/archive/tar/

Upvotes: 0
