Pierre Roudaut

Reputation: 1073

Is there a way to know if a filename is an Excel format?

My question may seem trivial, but despite a lot of searching I haven't found an answer.

Is there a way in .NET to know whether a file is an Excel spreadsheet?

I am not interested in the specific extension (.xls, .xlsx, etc.); I would just like to know whether the file is an Excel spreadsheet of any kind.

Upvotes: 0

Views: 2989

Answers (3)

Edgar

Reputation: 1137

You need to read the file's header bytes in order to know exactly what kind of file it is.

The FileTypeDetective library does exactly what you want, but it looks like the project is no longer active. In any case, it can easily be adapted or corrected once you get the idea.

See:

// MS Office files
public static readonly FileType WORD = new FileType(new byte?[] { 0xEC, 0xA5, 0xC1, 0x00 }, 512, "doc", "application/msword");
public static readonly FileType EXCEL = new FileType(new byte?[] { 0x09, 0x08, 0x10, 0x00, 0x00, 0x06, 0x05, 0x00 }, 512, "xls", "application/excel");
public static readonly FileType PPT = new FileType(new byte?[] { 0xFD, 0xFF, 0xFF, 0xFF, null, 0x00, 0x00, 0x00 }, 512, "ppt", "application/mspowerpoint");

All you have to do is find a signature common to all Excel files.

My guess is that this library still works very well. I see no reason for these headers to have changed since 2012 (the last release).
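
As a rough illustration of the header check, here is a minimal sketch that does not use FileTypeDetective. The class and method names are hypothetical. It only tests the two container signatures that Excel files start with: the OLE2 compound-file header used by legacy .xls, and the ZIP header used by .xlsx. Word and PowerPoint files share both containers, so a match means "might be Excel", not "is Excel".

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of FileTypeDetective.
public static class SpreadsheetSniffer
{
    // OLE2 compound file header: legacy .xls, but also .doc and .ppt
    private static readonly byte[] Ole2Magic = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // ZIP local file header: .xlsx, but also .docx, .pptx and ordinary archives
    private static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };

    // True if header begins with every byte of magic, in order
    public static bool StartsWith(byte[] header, byte[] magic) =>
        header.Length >= magic.Length && header.Take(magic.Length).SequenceEqual(magic);

    public static bool MightBeExcel(byte[] header) =>
        StartsWith(header, Ole2Magic) || StartsWith(header, ZipMagic);

    public static bool MightBeExcel(string path)
    {
        var header = new byte[8];
        using (var stream = File.OpenRead(path))
        {
            // Read may return fewer than 8 bytes for very short files
            int read = stream.Read(header, 0, header.Length);
            Array.Resize(ref header, read);
        }
        return MightBeExcel(header);
    }
}
```

To tell Excel apart from the other Office formats you still have to look further into the file, for example at offset 512 as the library does for .xls.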

Upvotes: 4

Vyacheslav Volkov

Reputation: 4742

Long ago I wrote something similar. Here is the code:

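// Namespaces needed by the members below
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

private enum Extensions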
{
    Unknown = 0,
    DocOrXls,
    Pdf,
    Jpg,
    Png,
    DocxOrXlsx,
}

private static readonly Dictionary<Extensions, string> ExtensionSignature = new Dictionary<Extensions, string>
    {
        {Extensions.DocOrXls, "D0-CF-11-E0-A1-B1-1A-E1"},
        {Extensions.Pdf, "25-50-44-46"},
        {Extensions.Jpg, "FF-D8-FF-E"},
        {Extensions.Png, "89-50-4E-47-0D-0A-1A-0A"},
        {Extensions.DocxOrXlsx, "50-4B-03-04-14-00-06-00"}
    };

private static string GetExtension(byte[] bytes)
{
    if (bytes.Length < 8)
        return string.Empty;
    var signatureBytes = new byte[8];
    Array.Copy(bytes, signatureBytes, signatureBytes.Length);
    string signature = BitConverter.ToString(signatureBytes);
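    // Match only at the start of the header; Contains could also match in the middle
    Extensions extension = ExtensionSignature.FirstOrDefault(pair => signature.StartsWith(pair.Value, StringComparison.Ordinal)).Key;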
    switch (extension)
    {
        case Extensions.Unknown:
            return string.Empty;
        case Extensions.DocOrXls:
            if (bytes.Length < 516) // 4 bytes are read starting at offset 512
                break;
            signatureBytes = new byte[4];
            Array.Copy(bytes, 512, signatureBytes, 0, signatureBytes.Length);
            signature = BitConverter.ToString(signatureBytes);
            if (signature == "EC-A5-C1-00")
                return ".doc";
            return ".xls";
        case Extensions.Pdf:
            return ".pdf";
        case Extensions.Jpg:
            return ".jpg";
        case Extensions.Png:
            return ".png";
        case Extensions.DocxOrXlsx:
            // ZIP entry names are stored as plain text, so look for the folder names
            string fileBody = Encoding.UTF8.GetString(bytes);
            if (fileBody.Contains("word/"))
                return ".docx";
            if (fileBody.Contains("xl/"))
                return ".xlsx";
            break;
        default:
            throw new ArgumentOutOfRangeException();
    }
    return string.Empty;
}
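
For the ZIP-based formats, a stricter alternative to searching the raw bytes for a substring is to open the archive and look for the workbook part. This is only a sketch under assumptions: the class name is hypothetical, it needs .NET 4.5 or later for System.IO.Compression, and it assumes the workbook is stored at xl/workbook.xml, which is where Excel itself writes it for .xlsx files.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper illustrating the idea; not part of the code above.
public static class OoxmlCheck
{
    // True if the stream is a ZIP archive that contains xl/workbook.xml
    public static bool LooksLikeXlsx(Stream stream)
    {
        try
        {
            using (var archive = new ZipArchive(stream, ZipArchiveMode.Read, leaveOpen: true))
            {
                // Assumption: the workbook part lives at the path Excel uses
                return archive.GetEntry("xl/workbook.xml") != null;
            }
        }
        catch (InvalidDataException)
        {
            // Not a readable ZIP archive at all
            return false;
        }
    }
}
```

Typical use would be something like `using (var fs = File.OpenRead(path)) { bool isXlsx = OoxmlCheck.LooksLikeXlsx(fs); }`. A .docx file passes the ZIP test but has no xl/workbook.xml entry, so it is rejected.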

Upvotes: 1

miguelarcilla

Reputation: 1456

You could wrap the call in a try-catch statement and see whether Excel can open the file (this requires Excel to be installed on the machine):

using Microsoft.Office.Interop.Excel;

....

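Application app = new Application();
try
{
    // workbookPath is the path of the file to test
    Workbook book = app.Workbooks.Open(workbookPath);
    book.Close(false); // close without saving
}
catch
{
    //Excel encountered an error opening the file at the path
}
finally
{
    app.Quit(); // otherwise the Excel process keeps running in the background
}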

Upvotes: 2
