Sergey Shafiev
Sergey Shafiev

Reputation: 4375

Low performance when reading data from Excel workbook to ArrayList in C#

Here's a problem description.

CONDITIONS: General idea is to read a lot of real numbers from MS Excel file and put them inro ArrayList for further processing. An excel workbook has only one worksheet. All the numbers are real and they are stored in one column. I read these numbers row by row and put them into ArrayList.

PROBLEM: the process takes too much time. Program spends about 2 minutes to fill an ArrayList with 10000 elements. Here's my code. I need your advise to make it faster. But the structure of the file cannot be modified. It's only possible to modify code. Help me please to make it faster.

// Method GetExcelData opens 1 excel file, reads data row by row and adds
// it into the array of source Data Values (sourceDataValues in our case).
private void GetExcelData(string fullPath, ArrayList arrForValues)
{
    Excel.Application excelapp = new Excel.Application();
    excelapp.Visible = false;
    // to avoid appearing of Excel window on the screen
    Excel.Workbook excelappworkbook = excelapp.Workbooks.Open(
        fullPath,
        Type.Missing, Type.Missing, true, Type.Missing,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing, Type.Missing, Type.Missing,
        Type.Missing, Type.Missing);
    Excel.Worksheet excelworksheet = (Excel.Worksheet)excelappworkbook.Worksheets.get_Item(1);
    Excel.Range excelcells = excelworksheet.UsedRange;
    uint rowsNum = 0;
    for (rowsNum = 1; rowsNum != excelcells.Rows.Count; rowsNum++)
    {
        arrForValues.Add((excelcells.Cells[rowsNum, 1] as Excel.Range).Value2);
    }
    excelappworkbook.Close(false, Type.Missing, Type.Missing);
    excelapp.Quit();
}

Upvotes: 1

Views: 6541

Answers (4)

Sergey Shafiev
Sergey Shafiev

Reputation: 4375

The problem is resolved. Everything is qute simple. First, we read all the range of current worksheet into simple two-dimension array - worksheetValuesArray. After that we put all the values from that array into our container, converting the type of elements to double. Here's the part of corrected solution:

private void GetExcelData(string fullPath, List<double> arrForValues)
        {
            Excel.Application excelapp = new Excel.Application();
            excelapp.Visible = false;
            // to avoid appearing of Excel window on the screen
            Excel.Workbook excelappworkbook = excelapp.Workbooks.Open(
                fullPath,
                Type.Missing, Type.Missing, true, Type.Missing,
                Type.Missing, Type.Missing, Type.Missing, Type.Missing,
                Type.Missing, Type.Missing, Type.Missing, Type.Missing,
                Type.Missing, Type.Missing);
            Excel.Worksheet excelworksheet = (Excel.Worksheet)excelappworkbook.Worksheets.get_Item(1);
            Excel.Range excelcells = excelworksheet.UsedRange;
            object[,] worksheetValuesArray = excelcells.get_Value(Type.Missing);

            for (int col = 1; col < (worksheetValuesArray.GetLength(1)+1); col++)
            {
                for (int row = 1; row < (worksheetValuesArray.GetLength(0)+1); row++)
                {
                    arrForValues.Add((double) worksheetValuesArray[row, col]);
                }
            }
            excelappworkbook.Close(false, Type.Missing, Type.Missing);
            excelapp.Quit();
        }

Upvotes: 5

Rob Haupt
Rob Haupt

Reputation: 2144

I tweaked the for loop. See if this yields better results.

    // Method GetExcelData opens 1 excel file, reads data row by row and adds
    // it into the array of source Data Values (sourceDataValues in our case).
    private void GetExcelData(string fullPath, ArrayList arrForValues)
    {

        Microsoft.Office.Interop.Excel.Application excelapp = new Microsoft.Office.Interop.Excel.Application();
        excelapp.Visible = false;
        // to avoid appearing of Excel window on the screen
        Microsoft.Office.Interop.Excel.Workbook excelappworkbook = excelapp.Workbooks.Open(
            fullPath,
            Type.Missing, Type.Missing, true, Type.Missing,
            Type.Missing, Type.Missing, Type.Missing, Type.Missing,
            Type.Missing, Type.Missing, Type.Missing, Type.Missing,
            Type.Missing, Type.Missing);
        Microsoft.Office.Interop.Excel.Worksheet excelworksheet = (Microsoft.Office.Interop.Excel.Worksheet)excelappworkbook.Worksheets.get_Item(1);
        Microsoft.Office.Interop.Excel.Range excelcells = excelworksheet.UsedRange;
        Microsoft.Office.Interop.Excel.Range newRange = excelworksheet.get_Range("A1","A"+excelcells.Rows.Count);
        object[,] items = newRange.Value;
        for (int i = 1; i < items.Length; i++)
        {
            arrForValues.Add(items[i,1]);
        }
        excelappworkbook.Close(false, Type.Missing, Type.Missing);
        excelapp.Quit();
    }

Upvotes: 1

harlam357
harlam357

Reputation: 1491

I don't know if you're gonna find much more performance. Excel interop is just slow (due to marshaling across COM boundaries I assume). I have gained some performance in my code (especially in Excel 2007 and higher) by setting the following.

excelapp.ScreenUpdating = false;

and

excelapp.Calculation = Excel.XlCalculation.xlCalculationManual;

Upvotes: 0

Paul
Paul

Reputation: 2530

My experience with Excel automation is that it is always slow. I usually try an alternative method, such as saving it as a CSV and reading the data with a stream reader and splitting the string on a delimiter (comma, tab, etc). I would suggest looking at the process of receiving your data and see if there is another format readily available.

Upvotes: 1

Related Questions