E Mett
E Mett

Reputation: 2302

Can only read Excel file when it is actually open in Ms Excel

I am using the following code to open an excel file (XLS) and populate a DataTable with the first worksheet:

var connectionString = string.Format("Provider=Microsoft.Jet.OLEDB.4.0; data source={0}; Extended Properties=Excel 8.0;", filename);

OleDbConnection connExcel = new OleDbConnection(connectionString);
connExcel.Open();
DataTable dtExcelSchema;
dtExcelSchema = connExcel.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);
string SheetName = dtExcelSchema.Rows[0]["TABLE_NAME"].ToString();
connExcel.Close();

var adapter = new OleDbDataAdapter("SELECT * FROM [" + SheetName + "]", connectionString);
var ds = new DataSet();
int count = 0;

adapter.Fill(ds, SheetName);

DataTable dt = ds.Tables[0];

It works only when the file is already open in Ms Excel. Why could that be?

If the file is not open, I get an error message (on line connExcel.Open): External table is not in the expected format.

Upvotes: 2

Views: 1437

Answers (3)

essemario
essemario

Reputation: 11

I had the same problem. If the file was open the read was ok but if the file was closed... some thing was strange... in my case I received strange data from columns and values.. Debugging I found the name of the first sheet and was strange ["xls _xlnm#_FilterDatabase"] looking on the internet I found that's a name of hidden sheet and a trick to avoid read this sheet (HERE) and so I've implemented a method:

private string getFirstVisibileSheet(DataTable dtSheet, int index = 0)
{
    string sheetName = String.Empty;
    if (dtSheet.Rows.Count >= (index + 1))
    {
        sheetName = dtSheet.Rows[index]["TABLE_NAME"].ToString();
        if (sheetName.Contains("FilterDatabase"))
        {
            return getFirstVisibileSheet(dtSheet, ++index);
        }
    }
    return sheetName;
}

To me worked very well.

My complete example code is:

string excelFilePath = String.Empty;
string stringConnection = String.Empty;

using (OpenFileDialog openExcelDialog = new OpenFileDialog())
{
    openExcelDialog.Filter = "Excel 2007 (*.xlsx)|*.xlsx|Excel 2003 (*.xls)|*.xls";
    openExcelDialog.FilterIndex = 1;
    openExcelDialog.RestoreDirectory = true;
    DialogResult windowsResult = openExcelDialog.ShowDialog();
    if (windowsResult != System.Windows.Forms.DialogResult.OK)
    {
        return;
    }

    excelFilePath = openExcelDialog.FileName;
    using (DataTable dt = new DataTable())
    {
        try
        {
            if (!excelFilePath.Equals(String.Empty))
            {
                stringConnection = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + excelFilePath + ";Extended Properties='Excel 8.0; HDR=YES;';";
                using (OleDbConnection conn = new OleDbConnection(stringConnection))
                {
                    conn.Open();
                    OleDbCommand cmd = new OleDbCommand();
                    cmd.Connection = conn;

                    DataTable dtSheet = conn.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, null);

                    string sheetName = getFirstVisibileSheet(dtSheet);

                    cmd.CommandText = "SELECT * FROM [" + sheetName + "]";
                    dt.TableName = sheetName;

                    OleDbDataAdapter da = new OleDbDataAdapter(cmd);
                    da.Fill(dt);

                    cmd = null;
                    conn.Close();
                }
            }

            //Read and Use my DT
            foreach (DataRow row in dt.Rows)
            {
                //On my case I need data on first and second Columns
                if ((row.ItemArray.Count() < 2) ||
                    (row[0] == null || String.IsNullOrWhiteSpace(row[0].ToString()))
                    ||
                    (row[1] == null ||String.IsNullOrWhiteSpace(row[1].ToString())))
                {
                    continue;
                }

                //Get the number from the first COL
                int colOneNumber = 0;
                Int32.TryParse(row[0].ToString(), out colOneNumber);

                //Get the string from the second COL
                string colTwoString = row[1].ToString();

                //Get the string from third COL if is a file path valid
                string colThree = (row.ItemArray.Count() >= 3 
                                && !row.IsNull(2) 
                                && !String.IsNullOrWhiteSpace(row[2].ToString()) 
                                && File.Exists(row[2].ToString())
                                ) ? row[2].ToString() : String.Empty;
            }
        }
        catch (Exception ex)
        {
            MessageBox.Show("Import error.\n" + ex.Message, "::ERROR::", MessageBoxButtons.OK, MessageBoxIcon.Error);
        }
    }
}

private string getFirstVisibileSheet(DataTable dtSheet, int index = 0)
{
    string sheetName = String.Empty;
    if (dtSheet.Rows.Count >= (index + 1))
    {
        sheetName = dtSheet.Rows[index]["TABLE_NAME"].ToString();
        if (sheetName.Contains("FilterDatabase"))
        {
            return getFirstVisibileSheet(dtSheet, ++index);
        }
    }
    return sheetName;
}

Upvotes: 1

Scare
Scare

Reputation: 81

I'm facing the same problem and accordingly to this site, many developers are struggling for the same:

-When I try read Excel with OLE DB all values are empty

-Can't connect to excel file unless file is already open

Actually I'm using the classic connection string (note that I'm trying to read a 97/2003 file):

Provider=Microsoft.Jet.OLEDB.4.0; Data Source = " + GetFilename(filename) + "; Extended Properties ='Excel 8.0;HDR=NO;IMEX=1'

but the file can be read properly only if:

  • Is open in Excel or even in Word! (the file of course appears corrupted and unreadable, but then the OleDb procedure can read every line of the file), I didn't try with other Office apps

  • The file is not in read-only mode

I also tried to lock the file manually or to open it with other non-office applications, but the result is not the same. If I follow the two previous rules (file opened in Word or Excel in not read-only mode) I can see all the cells, otherwise it seems the first column is ignored completely (so F2 became F1, F3 became F2,... and F6, the last one, should became F5 otherwise it throws and out-of-index error).

In order to keep compatibility with OleDb without using 3rd parties libraries I found a very stupid workaround using Microsoft.Office.Interop.Excel assembly.

Excel.Application _app = new Excel.Application();
var workbooks = _app.Workbooks;
workbooks.Open(_filename);

// OleDb Connection
using (OleDbConnection conn = new OleDbConnection(connectionOleDb))
            {
                try
                {
                    conn.Open();
                    OleDbCommand cmd = new OleDbCommand();
                    cmd.Connection = conn;

                    cmd.CommandText = String.Format("SELECT * FROM [{0}$]", tableName);
OleDbDataReader myReader = cmd.ExecuteReader();

                    int i = 0;
                    while (myReader.Read())
                    {
                        //Here I read through all Excel rows
                    }


                }
                catch (Exception E)
                {
                    MessageBox.Show("Error!\n" + E.Message);
                }
                finally
                {
                    conn.Close();

                    workbooks.Close();
                    if (workbooks != null)
                        System.Runtime.InteropServices.Marshal.ReleaseComObject(workbooks);
                    _app.Quit();
                    System.Runtime.InteropServices.Marshal.ReleaseComObject(_app);
                }
            }

Essentially the first 3 lines run an Excel instance that lasts exactly the time needed to OleDb to perform its tasks. The last 4 lines, inside the finally block, let the Excel instance to be closed correctly, immediately after the task and avoid ghost Excel processes.

I repeat it's a very stupid workaround that also requires a 1,5 MB dll (Microsoft.Office.Interop.Excel.dll) to be added to the project. Anyway seems impossible that OleDb cannot manage by itself the missing data...

Upvotes: 1

Roberto
Roberto

Reputation: 2194

Is it failing on ToString(), like here? Error is "Object reference not set to an instance of an object"

Does Convert.ToString() fix anything?

Upvotes: 0

Related Questions