Muneeb Zulfiqar
Muneeb Zulfiqar

Reputation: 1023

parse excel file best practise

I am facing an issue in parsing excel file. My file has more than 5000 rows. When I parse it, its taking ages I wanted to ask if there's any better way to do so.

public static List<List<List<string>>> ExtractData(string filePath)
{
    List<List<List<string>>> Allwork = new List<List<List<string>>>();
    Microsoft.Office.Interop.Excel.Application excelApp = new Microsoft.Office.Interop.Excel.Application();
    Microsoft.Office.Interop.Excel.Workbook workBook = excelApp.Workbooks.Open(filePath);


    foreach (Microsoft.Office.Interop.Excel.Worksheet sheet in workBook.Worksheets)
    {

        List<List<string>> Sheet = new List<List<string>>();

        Microsoft.Office.Interop.Excel.Range usedRange = sheet.UsedRange;

        //Iterate the rows in the used range
        foreach (Microsoft.Office.Interop.Excel.Range row in usedRange.Rows)
        {
            List<string> Rows = new List<string>();
            String[] Data = new String[row.Columns.Count];
            for (int i = 0; i < row.Columns.Count; i++)
            {
                try
                {
                    Data[i] = row.Cells[1, i + 1].Value2.ToString();
                    Rows.Add(row.Cells[1, i + 1].Value2.ToString());
                }
                catch
                {
                    Rows.Add("     ");

                }
            }
            Sheet.Add(Rows);
        }
        Allwork.Add(Sheet);

    }
    excelApp.Quit();
    return Allwork;
}

This is my code.

Upvotes: 0

Views: 2704

Answers (4)

Govert
Govert

Reputation: 16907

One approach is to use something like the ClosedXML library to directly read the .xlsx file, not going through the Excel interop.

Upvotes: 0

Jas
Jas

Reputation: 1141

If that's an option, and if your tables have a simple structure, I would suggest to try exporting the file to .csv and applying simple string processing logic.

You might also want to try out the Igos's sugestion.

Upvotes: 0

Marko
Marko

Reputation: 2734

Your issue is that you are reading one cell at a time, this is very costly and inefficient try reading a range of cells.

Simple example below

Excel.Range range = worksheet.get_Range("A"+i.ToString(), "J" + i.ToString());

System.Array myvalues = (System.Array)range.Cells.Value;

string[] strArray = ConvertToStringArray(myvalues);

A link to basic example Read all the cell values from a given range in excel

Upvotes: 3

Igos
Igos

Reputation: 211

I suggest not use interop, but odbc connection for getting excel data. This will allow you to treat excel file as database and use sql statements to read needed data.

Upvotes: 2

Related Questions