OpenXML documents: how do you know which is which when there is no extension?

What I have done for now, which works, is this:

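    // Requires the Open XML SDK: using DocumentFormat.OpenXml.Packaging;
    private string DetermineOpenXML(string file)
    {
        // Try each document type in turn. This relies on Open throwing
        // when the file is not of the requested type.
        try
        {
            using (SpreadsheetDocument doc = SpreadsheetDocument.Open(file, false)) { }
            return ".xlsx";
        }
        catch
        {
            try
            {
                using (WordprocessingDocument doc = WordprocessingDocument.Open(file, false)) { }
                return ".docx";
            }
            catch
            {
                try
                {
                    using (PresentationDocument doc = PresentationDocument.Open(file, false)) { }
                    return ".pptx";
                }
                catch
                {
                    // None of the three types matched.
                    return string.Empty;
                }
            }
        }
    }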

I think there must be a better way to determine the file type than trial and error. I am working on a small program that works out which extension a file should have. I need this because the files come from a database, where they are sometimes saved without an extension and other times with the wrong one.

While looking at these files I found that all OpenXML documents share the same file signature, "50 4B 03 04 14 00 06 00", which is close to the signature of a zip file, and I can also open OpenXML files with a zip program and see their contents. Maybe this is the solution I should go for; I was just hoping it would be faster and easier to use the OpenXML SDK, and that it had a property or something similar that could check this for me.
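For reference, the signature check described above could be sketched roughly like this. It is only a pre-filter: the first four bytes (50 4B 03 04) are common to ZIP archives in general, so a match does not say whether the file is Word, Excel or PowerPoint. The names `ZipSignature` and `LooksLikeZip` are made up for illustration.

```csharp
using System;
using System.IO;

public static class ZipSignature
{
    // First four bytes of a ZIP local file header ("PK" 03 04).
    private static readonly byte[] Magic = { 0x50, 0x4B, 0x03, 0x04 };

    // Hypothetical helper: true if the buffer starts with the ZIP signature.
    public static bool LooksLikeZip(byte[] header)
    {
        if (header == null || header.Length < Magic.Length) return false;
        for (int i = 0; i < Magic.Length; i++)
        {
            if (header[i] != Magic[i]) return false;
        }
        return true;
    }

    // Reads the first four bytes of the file and checks them.
    public static bool LooksLikeZip(string path)
    {
        byte[] header = new byte[Magic.Length];
        using (FileStream fs = File.OpenRead(path))
        {
            int total = 0;
            while (total < header.Length)
            {
                int n = fs.Read(header, total, header.Length - total);
                if (n == 0) break; // file is shorter than the signature
                total += n;
            }
            return total == header.Length && LooksLikeZip(header);
        }
    }
}
```

A file that passes this check would still need to be opened as a package to find out which kind of document it is.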

Edit: I have added an answer. I would still like to see if there is a better solution, even though mine works for my current purpose. It does not take into account that some of the files should have template extensions.

Upvotes: 4

Views: 1730

Answers (2)

I ended up using System.IO.Packaging instead.

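    // Requires: using System; using System.IO; using System.IO.Packaging;
    private string anotherOpenXmlAttempt(string file)
    {
        string ext = string.Empty;
        // Open read-only; Package.Open(file) on its own asks for read/write access.
        using (Package package = Package.Open(file, FileMode.Open, FileAccess.Read))
        {
            // Look for the main part at the path each document type normally uses.
            if (package.PartExists(new Uri("/word/document.xml", UriKind.Relative)))
            {
                ext = ".docx";
            }
            else if (package.PartExists(new Uri("/xl/workbook.xml", UriKind.Relative)))
            {
                ext = ".xlsx";
            }
            else if (package.PartExists(new Uri("/ppt/presentation.xml", UriKind.Relative)))
            {
                ext = ".pptx";
            }
        }

        return ext;
    }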

I haven't done any extensive testing, but it has worked for my current files.

I will leave the question open in case someone has a nice solution.
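One possible refinement for the template case, as a rough sketch only: instead of testing fixed part paths, follow the package's officeDocument relationship to the main part and look at that part's content type, which differs between documents and templates. The class and method names (`OpenXmlKind`, `GuessExtension`) are hypothetical, and the table covers only six common types; macro-enabled and other variants would need further entries. On .NET Framework, System.IO.Packaging comes from the WindowsBase assembly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Packaging;

public static class OpenXmlKind
{
    // Relationship type that points from the package root to the main part.
    private const string OfficeDocumentRel =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // Main-part content types mapped to extensions (deliberately incomplete).
    private static readonly Dictionary<string, string> ExtensionByContentType =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
    {
        { "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml", ".docx" },
        { "application/vnd.openxmlformats-officedocument.wordprocessingml.template.main+xml", ".dotx" },
        { "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml", ".xlsx" },
        { "application/vnd.openxmlformats-officedocument.spreadsheetml.template.main+xml", ".xltx" },
        { "application/vnd.openxmlformats-officedocument.presentationml.presentation.main+xml", ".pptx" },
        { "application/vnd.openxmlformats-officedocument.presentationml.template.main+xml", ".potx" },
    };

    // Hypothetical helper: returns an extension, or string.Empty if unknown.
    public static string GuessExtension(string file)
    {
        using (Package package = Package.Open(file, FileMode.Open, FileAccess.Read))
        {
            foreach (PackageRelationship rel in package.GetRelationshipsByType(OfficeDocumentRel))
            {
                if (rel.TargetMode != TargetMode.Internal) continue;

                // Package-level relationships are resolved against the root "/".
                Uri partUri = PackUriHelper.ResolvePartUri(
                    new Uri("/", UriKind.Relative), rel.TargetUri);
                if (!package.PartExists(partUri)) continue;

                string ext;
                string contentType = package.GetPart(partUri).ContentType;
                if (ExtensionByContentType.TryGetValue(contentType, out ext))
                {
                    return ext;
                }
            }
        }
        return string.Empty;
    }
}
```

Like the version above, this has not been tested against a wide range of files, so treat the mapping as a starting point.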

Upvotes: 4

JoeKir

Reputation: 1117

From my experience with the Open XML SDK 2, it is more useful for manipulating the XML internals of a document. If you just need the extension type, then why not just use:

string extension = System.IO.Path.GetExtension(filename);

It's worth noting that try/catch is an expensive approach for just determining external details, since every thrown exception has to collect the exception details, stack trace and so on for the catch block.

Also, Excel's extension is .xlsx, not .xslt; XSLT stands for "Extensible Stylesheet Language Transformations".

Hope that helps!

Upvotes: 0
