Nathu
Nathu

Reputation: 262

Reading very large data from an Excel 2013 file using OleDB range error

I am trying to read an Excel 2013 file (.xlsx which is about 100 MB in size) using Visual Basic.NET with the help of OleDB. The main concern was getting a system out of memory exception in the line:

da.Fill(dt)

from the code below.

Private Function ReadExcelFile() As DataSet
    Dim ds As New DataSet()

    Dim connectionString As String =
    "Provider=Microsoft.ACE.OLEDB.12.0;;Extended Properties=Excel 12.0 XML;Data Source=C:\file.xlsx;"

    Using connection As New OleDbConnection(connectionString)
        connection.Open()
        Dim cmd As New OleDbCommand()
        cmd.Connection = connection
        Dim dtSheet As DataTable = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, Nothing)

        For Each dr As DataRow In dtSheet.Rows
            Dim sheetName As String = dr("TABLE_NAME").ToString()
            If Not sheetName.EndsWith("$") Then
                Continue For
            End If

            cmd.CommandText = "SELECT * FROM [" & sheetName & "];"
            Dim dt As New DataTable()
            dt.TableName = sheetName
            Dim da As New OleDbDataAdapter(cmd)
            da.Fill(dt)
            ds.Tables.Add(dt)
        Next

        cmd = Nothing
        connection.Close()
    End Using
    Return ds
End Function

but I think the best solution is to read the data by chunks so I found out that I could read the data by adding a column range in the SQL statement like this:

 cmd.CommandText = "SELECT * FROM [" & sheetName & "B1:B10];"

I did a loop by doing increments on that range but I found an error. With this for example,

cmd.CommandText = "SELECT * FROM [" & sheetName & "B50000:B51000];"

It still works. However, if I do,

cmd.CommandText = "SELECT * FROM [" & sheetName & "B70000:B70001];"

I get this error.

OleDb Error

Note that the Excel file has 475128 rows and B70000-B70001 isn't even half the total.

Total Columns

Could anyone shed some light? I think I'm missing something here.

Upvotes: 2

Views: 1293

Answers (1)

Nathu
Nathu

Reputation: 262

I found a working solution. Instead of using DataSet, use DataReader. I can just add a worker so it won't hang up.

 Private Function ReadExcelFile() As DataSet
    Dim ds As New DataSet()

    Dim connectionString As String = GetConnectionString()

    Using connection As New OleDbConnection(connectionString)
        connection.Open()
        Dim cmd As New OleDbCommand()
        cmd.Connection = connection
        Dim dtSheet As DataTable = connection.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, Nothing)

        For Each dr As DataRow In dtSheet.Rows
            Dim sheetName As String = dr("TABLE_NAME").ToString()
            If Not sheetName.EndsWith("$") Then
                Continue For
            End If
            cmd.CommandText = "SELECT * FROM [" & sheetName & "];"
            Dim ddr As OleDbDataReader = cmd.ExecuteReader()
            Dim counter As Integer = 0
            While (ddr.Read())
                MessageBox.Show(ddr.GetValue(0))
            End While
        Next
        cmd = Nothing
        connection.Close()
    End Using
    Return ds
End Function

The lines:

Dim ddr As OleDbDataReader = cmd.ExecuteReader()
Dim counter As Integer = 0
While (ddr.Read())
     MessageBox.Show(ddr.GetValue(0))
End While

are the essential code where you can access the rows of the first column (index 0). This works because I read that DataSet is an in-memory object (Which is the reason why we possible can get a system out of memory exception) - Check here for reference

I'd still want to hear about why my above problem is popping up.

Upvotes: 1

Related Questions