John Bustos

Reputation: 19544

Excel data extraction - Issue with column data type

I am writing a C# library to read in Excel files (both xls and xlsx) and I'm coming across an issue.

My problem is exactly the one described in this question: if an Excel column contains string values but has a numeric value in the first row, the OLEDB provider assumes the column is numeric and returns NULL for every value in that column that is not numeric.

I am aware that, as in the answer provided there, I can make a change in the registry. However, this is a library I plan to use on many machines, and I don't want to change every user's registry values, so I was wondering if there is a better solution.

Maybe a DB provider other than ACE.OLEDB (and it seems JET is no longer supported well enough to be considered)?

Also, since this needs to work on XLS / XLSX, options such as EPPlus / XML readers won't work for the xls version.

Upvotes: 7

Views: 3458

Answers (1)

Cory

Reputation: 1802

Your connection string should look like this:

Provider=Microsoft.ACE.OLEDB.12.0;Data Source=c:\myFolder\myExcelfile.xlsx;Extended Properties="Excel 12.0 Xml;HDR=YES;IMEX=1";

IMEX=1 is the part of the connection string that you need: it tells the provider to read columns with mixed data types as text. This should work fine without the need to edit the registry.

HDR=YES simply marks the first row as column headers. It is not needed for your particular problem, but I've included it anyway.

Always using IMEX=1 is a safer way to retrieve data from columns with mixed data types.

Source: https://www.connectionstrings.com/excel/
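Since the question needs both XLS and XLSX, here is a minimal sketch of picking the Extended Properties value from the file extension. The class and method names (ExcelConnectionStrings, Build) are made up for illustration and are not part of any library; the "Excel 8.0" value for .xls files is the one listed on connectionstrings.com for the ACE provider.

```csharp
using System;
using System.IO;

// Hypothetical helper, not from the original answer.
public static class ExcelConnectionStrings
{
    // Builds an ACE OLEDB connection string with IMEX=1 for .xls or .xlsx files.
    public static string Build(string path, bool hasHeader = true)
    {
        string ext = Path.GetExtension(path).ToLowerInvariant();
        string excelVersion;
        switch (ext)
        {
            case ".xls":
                excelVersion = "Excel 8.0";       // Excel 97-2003 workbook
                break;
            case ".xlsx":
                excelVersion = "Excel 12.0 Xml";  // Excel 2007+ workbook
                break;
            default:
                throw new NotSupportedException("Unsupported extension: " + ext);
        }

        string hdr = hasHeader ? "YES" : "NO";
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + path + ";" +
               "Extended Properties=\"" + excelVersion + ";HDR=" + hdr + ";IMEX=1\";";
    }
}
```

For example, ExcelConnectionStrings.Build(@"C:\test.xls") returns a string whose Extended Properties are "Excel 8.0;HDR=YES;IMEX=1". One caveat, as far as I know: with IMEX=1 the provider still guesses each column's type from a sample of the first rows (TypeGuessRows, 8 by default), so a column whose sampled rows are all numeric may still be read as numeric.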

Edit:

Here is the data I'm using:

[Screenshot of the spreadsheet data; image not available]

Here is the output:

[Screenshot of the console output; image not available]

This is the exact code I used:

string connString = @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\test.xlsx;Extended Properties=""Excel 12.0 Xml;HDR=YES;IMEX=1""";

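using (DbClass db = new DbClass(connString))
{
    // Query the worksheet named Sheet1 through the wrapper.
    var x = db.dataReader("SELECT * FROM [Sheet1$]");
    while (x.Read())
    {
        // Print every field of the current row, each followed by a tab.
        for (int i = 0; i < x.FieldCount; i++)
            Console.Write(x[i] + "\t");
        Console.WriteLine();
    }
}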

The DbClass is a simple wrapper I made in order to make life easier. It can be found here:

http://tech.reboot.pro/showthread.php?tid=4713
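If you would rather not depend on the DbClass wrapper, the same loop can be sketched with the standard System.Data.OleDb classes. This is only an approximation of what the wrapper does, not its actual code; the names ExcelDump, FormatRow and PrintSheet are invented for this example. It assumes the ACE provider is installed and, on newer .NET versions, that the System.Data.OleDb package is referenced.

```csharp
using System;
using System.Data;
using System.Data.OleDb;

// Hypothetical stand-in for the DbClass wrapper used above.
public static class ExcelDump
{
    // Joins every field of the current row with tabs (NULLs become empty strings).
    public static string FormatRow(IDataRecord record)
    {
        var values = new string[record.FieldCount];
        for (int i = 0; i < record.FieldCount; i++)
            values[i] = Convert.ToString(record[i]);
        return string.Join("\t", values);
    }

    // Opens the workbook and prints each row of the given sheet to the console.
    public static void PrintSheet(string connString, string sheetName)
    {
        using (var conn = new OleDbConnection(connString))
        using (var cmd = new OleDbCommand("SELECT * FROM [" + sheetName + "$]", conn))
        {
            conn.Open();
            using (OleDbDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                    Console.WriteLine(FormatRow(reader));
            }
        }
    }
}
```

Calling ExcelDump.PrintSheet(connString, "Sheet1") with the connection string above should print the same rows as the wrapper version, minus the trailing tab on each line.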

Upvotes: 0
