user2881691
user2881691

Reputation: 23

Get column names from excel file of a specific sheet using c# with OleDbConnection

So far I have managed to get the column names of the whole excel file, but what I would like to do is to get the column names of the excel file of a given table (sheet). How could I modify the code to achieve this. I have been trying for a while now with no positive results, any help much appreciated.

public static List<String> ReadSpecificTableColumns(string filePath, string sheetName)
    {
        var columnList = new List<string>();
        try
        {
            var excelConnection = new OleDbConnection("provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + filePath + "';Extended Properties='Excel 12.0;IMEX=1'");
            excelConnection.Open();
            var columns = excelConnection.GetOleDbSchemaTable(OleDbSchemaGuid.Columns, null);
            excelConnection.Close();

            if (columns != null)
            {
                columnList.AddRange(from DataRow column in columns.Rows select column["Column_name"].ToString());
            }

        }
        catch (Exception exception)
        {
            Console.WriteLine(exception.Message);
        }

        return columnList;
    }

Upvotes: 2

Views: 24057

Answers (5)

daniyal_1363
daniyal_1363

Reputation: 1

This function easily returns the columns of your sheet using OleDbConnection:

Public Function GetColumnNames(Optional SheetName As String = Nothing) As DataTable
    Try
        Dim dt As New DataTable
        
        Using OleCon = New OleDbConnection(ConnectionString)
        
            OleCon.Open()

            dt = OleCon.GetOleDbSchemaTable(OleDbSchemaGuid.Columns,
                    New String() {Nothing, Nothing, SheetName})

            OleCon.Close()
        
        End Using
        
        Return dt
        
    Catch ex As Exception
        Return Nothing
    End Try
End Function

Upvotes: 0

Brandon
Brandon

Reputation: 1

I had to go dig this up since it was not in the first click in search engine.

It is much better to go to the schema tables of the OleDb since it defines the schema the provider gave for the OleDb connection. Most are querying the Table and getting the schema information from there, and if the table is very large the query could take more time. Time might be irrelevant since we are talking about excel tables and most excel table sizes are small.

The Original poster was mostly there, they just needed a select statement on their schema columns table to select the schema data for their table:
DataRow[] shtRows = schemaDT.Select("[TABLE_NAME] = 'Sheet1$'");

This solution is provider independent and there is no need to guess or know how the schema of your dataset is created.

The OleDb contains tables that outline the schema definition of the importing data. Since the provider does a bunch of items behind the scenes (i.e. Auto forces data type to number even if you specify force to text [IMEX=1] due to it taking a sample of first few entries).

The schema tables can be retrieved by the GetOleDbSchemaTable() function. Below the [XXX] can be replace with any item from the OleDbSchemaGuid class. To answer the question here Columns was used:

    DataTable schemaDT = oleConn.GetOleDbSchemaTable(OleDbSchemaGuid.[XXX], null);

Columns -- Returns the columns of tables (including views) defined in the catalog that is accessible to a given user.

This can pull out the column names:

    using System;
    using System.Linq;
    using System.Data;
    using System.Data.OleDb;
    using System.Collections.Generic;

    namespace ConsoleApp1
    {
        class Program
        {
            static void Main(string[] args)
            {
                string excelFilePath = @"C:\myexcelfile.xlsx";
                string providerString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 12.0 Xml; HDR=YES;IMEX=1\";";
               
                List<string> columnNames = new List<string>();
                using (OleDbConnection oleConn = new OleDbConnection(String.Format(providerString, excelFilePath)))
                {
                    oleConn.Open();
                    DataTable schemaDT = oleConn.GetOleDbSchemaTable(OleDbSchemaGuid.Columns, null);
                    DataRow[] shtRows = schemaDT.Select("[TABLE_NAME] = 'Sheet1$'");
                    columnNames = shtRows.Select(o => o.Field<string>("COLUMN_NAME")).ToList();
                }

                Console.WriteLine(String.Join("\n", columnNames));
                Console.ReadLine();
            }
        }
    }

Upvotes: 0

Bartosz Kurczyński
Bartosz Kurczyński

Reputation: 33

What about using such snippet:

var adapter = new OleDbDataAdapter("SELECT * FROM [" +sheetName + "$A1:Z1]", excelConnection);
var table = new DataTable();
adapter.Fill(table);

For connection string with "HDR=Yes" it will read first row and the destination table will have all columns but no data.

Upvotes: 2

Mayank Sehgal
Mayank Sehgal

Reputation: 96

You can do something like this:

    private void ValidateExcelColumns(string filePath)
    {            
        var connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + filePath + ";Extended Properties=\"Excel 12.0;IMEX=1;HDR=Yes;TypeGuessRows=0;ImportMixedTypes=Text\""; ;
        using (var conn = new OleDbConnection(connectionString))
        {
            conn.Open();

            DataTable dt = new DataTable();
            var sheets = conn.GetOleDbSchemaTable(System.Data.OleDb.OleDbSchemaGuid.Tables, new object[] { null, null, null, "TABLE" });
            using (var cmd = conn.CreateCommand())
            {
                cmd.CommandText = "SELECT TOP 1 * FROM [" + sheets.Rows[0]["TABLE_NAME"].ToString() + "] ";
                var adapter = new OleDbDataAdapter(cmd);
                adapter.Fill(dt);
            }

            foreach(DataColumn column in dt.Columns)
            {
                //Do something with your columns
            }
        }
    }

Upvotes: 0

asdf_enel_hak
asdf_enel_hak

Reputation: 7630

You did not include sheet name in your code.
You can try below code:

var adapter = new OleDbDataAdapter("SELECT * FROM [" +sheetName + "$]", excelConnection);
var ds = new DataSet();
adapter.Fill(ds, "myTable");
DataTable data = ds.Tables["myTable"];

foreach(DataColumn  dc in data.Columns){
...
}

Upvotes: 3

Related Questions