Rotem
Rotem

Reputation: 21917

DotNetZip Corruption after Rename

I am using the DotNetZip library to rename a file within a zip archive.

Here is the code I am using:

void RenameFile(string existingArchive, string oldName, string newName)
{
    using (ZipFile zip = ZipFile.Read(existingArchive))
    {
        ZipEntry targetFile = zip.Entries.FirstOrDefault(f =>
            f.FileName.ToLower() == oldName.ToLower());

        //targetFile is not null.

        targetFile.FileName = newName;
        zip.Save();
    }
}

The issue I'm having is that after the rename operation, the zip is corrupted. When I browse it using WinRAR, I have a duplicate entry for the file I renamed (both of the entries are identical, with the new name). When I try to extract it, WinRAR throws a CRC error on the two identical entries.

I've tried saving to a different filename, but without luck.

I am using v1.9.1.8 of the library (the reduced version).

Update:

Removing one of the duplicate entries after renaming seems to work and has no visible side effects. Still, a less hacky approach would be welcome.

void RenameFile(string existingArchive, string oldName, string newName)
{
    using (ZipFile zip = ZipFile.Read(existingArchive))
    {
        ZipEntry targetFile = zip.Entries.FirstOrDefault(f =>
            f.FileName.ToLower() == oldName.ToLower());

        //targetFile is not null.

        targetFile.FileName = newName;

        //if the file was duplicated, remove one of the duplicates:
        var dupFiles = zip.Where(f => f.FileName == newName).ToList();
        if (dupFiles.Count > 1)
        {
            zip.RemoveEntries(dupFiles.Skip(1).ToList());
        }

        zip.Save();
    }
}

Update 2:

As the answers suggest the source file might be the problem, I've uploaded a sample zip file (24kb) here

This is the code I am using to verify it suffers from this issue:

using (ZipFile zip = ZipFile.Read("test.zip"))
{
    ZipEntry entry = zip.Entries.FirstOrDefault(e => Path.GetFileName(e.FileName) == "test.max");
    entry.FileName = Path.Combine(Path.GetDirectoryName(entry.FileName), "newname.max");
    zip.Save(@"test2.zip");
}

Upvotes: 1

Views: 2350

Answers (2)

Daniel A.A. Pelsmaeker
Daniel A.A. Pelsmaeker

Reputation: 50326

I can't reproduce your problem with a random ZIP file I had lying around and the latest stable 1.9.1.8 Reduced precompiled binary or the latest Reduced source code. The code you posted results in a ZIP archive I can successfully open and extract files from using both Windows Explorer and 7zip (I don't have WinRAR).

Note that both Anders Gustafsson's approach and your approach use the same ZipFile's internal data structures, so there should be no difference and your code should just work.

public ZipEntry this[String fileName]
{
    get
    {
        if (_entries.ContainsKey(fileName))
            return _entries[fileName];
        return null;
    }
}

public ICollection<ZipEntry> Entries
{
    get { return _entries.Values; }
}

Maybe it's an issue with the source ZIP file you're using, or with the full path of that ZIP (too long?), or the old and new filenames you chose? Or perhaps you have another application open that somehow prevents DotNetZip from removing the old entry?


Update: problem found!

The problem was indeed unique to your ZIP file, but actually it is a bug in DotNetZip.

When the library reads file entries from the ZIP file, it takes the filename exactly as specified by the ZIP file and stores it in the Filename property of the ZipEntry, and then stores this filename as the key in the _entries dictionary. In your case, the Filename property of your file entry has the following value:

C:\Users\rotem\Documents\3dsmax\scenes\test.max

Then, when you change the value of the Filename property, DotNetZip removes the current entry from _entries and re-adds the entry with the new filename.

To remove the current entry, it normalizes the current filename and removes the entry with that filename from _entries. So, it is trying to remove an entry with this normalized filename as the key:

Users/rotem/Documents/3dsmax/scenes/test.max

Of course, there is no key in _entries with that filename. The entry that it meant to remove has a C:\ prefix and all backward slashes. So, the old entry doesn't get removed.

But DotNetZip doesn't notice this and continues. It sets the entry's Filename to the normalized new filename:

Users/rotem/Documents/3dsmax/scenes/newname.max

And then re-adds the entry to the _entries dictionary. Now there are two entries for the same file entry: one with the normalized new filename, and one with the unnormalized old filename. And that's where all your problems come from.

I recommend that DotNetZip asserts that it really removed exactly one entry on renaming. And it should normalize any filename that is assigned to the ZipEntry.Filename property or its backing field, or that is used as a key in _entries.


Issue #16130 has been created on the DotNetZip CodePlex page.

Upvotes: 2

Anders Gustafsson
Anders Gustafsson

Reputation: 15971

Your approach seems reasonable, so there might very well be a bug in the DotNetZip library.

BUT If you look at the C# Examples on the DotNetZip CodePlex site, section Update a zip archive, you'll see that the recommended (?) approach for renaming an existing entry in an archive is:

zip[oldName].FileName = newName;

This means that your method would change to this:

void RenameFile(string existingArchive, string oldName, string newName)
{
    using (ZipFile zip = ZipFile.Read(existingArchive))
    {
        zip[oldName].FileName = newName;
        zip.Save();
    }
}

As far as I can tell from my tests, the above approach works as intended, and the indexer property this[string fileName] appears to make a case-insensitive file name match

Upvotes: 0

Related Questions