Tom3609
Tom3609

Reputation: 11

C# Open XML SDK 2.0 Excel spreadsheet - load range of cells from string array

In a .Net windows desktop application, I am able to import a string array of data into a range of cells located in Excel spreadsheet. The C# code is as follows:

using Excel = Microsoft.Office.Interop.Excel;

// Create Application, Workbook, and Worksheet
xlApp = new Microsoft.Office.Interop.Excel.Application();
xlWorkBook = xlApp.Workbooks.Add(misValue);
xlWs = (Excel.Worksheet)xlWorkBook.Worksheets.get_Item(1);

// Move data from temp array into Excel spreadsheet.
Excel.Range c1 = (Excel.Range)xlWs.Cells[startRowNum, 1];
Excel.Range c2 = (Excel.Range)xlWs.Cells[startRowNum + myTable.Rows.Count - 1,   Columns.Count];
Excel.Range range = xlWs.get_Range(c1, c2);
range.Value = tempArray;

I am trying to duplicate this process in an ASP.Net webpage. Using Visual Studio 2010 and C#. How can I do this same import of a string array into an Open XML SDK 2.0 Excel spreadsheet range of cells?

Upvotes: 1

Views: 15280

Answers (2)

Vincent Tan
Vincent Tan

Reputation: 3166

As answered, a library might be easier to use. An alternative library choice is SpreadsheetLight. Here's how the code might look like:

SLDocument sl = new SLDocument("YourFile.xlsx");
// row 1 column 1
sl.SetCellValue(1, 1, "String1");
// row 1 column 2
sl.SetCellValue(1, 2, "String2");
sl.SaveAs("AnotherFile.xlsx");

You don't have to worry about which order you set the cell values. Internally, SpreadsheetLight runs on Open XML SDK. Disclaimer: I wrote SpreadsheetLight.

Upvotes: 4

JPW
JPW

Reputation: 1211

Using the OpenXML SDK directly instead of going through Excel's automation model is much more complex and error-prone. Therefore I recommend using a library for this; espcially if your task gets more complex (e.g. http://excelpackage.codeplex.com/). Edit: An example on doing something similar using ExcelPackage can be found here: http://excelpackage.codeplex.com/wikipage?title=Using%20a%20template%20to%20create%20an%20Excel%20spreadsheet Although I have no idea on what performance to expect compared to using the raw SDK, I would guess that ExcelPackage is using the SDK internally anyway and therefore it might produce some overhead. Probably only a measurement for you concrete scenario can provide a definite answer here.

If you want to stick to the SDK though here is an example for inserting strings into an Excel workbook:

string filePath = "workbook.xlsx";
string sheetName = "Sheet1";
uint startRow = 9;
string columnName = "C";
string[] data = new string[] { "A", "B", "C" };

using (var spreadsheetDocument = SpreadsheetDocument.Open(filePath, true))
{
    // Find the Id of the worksheet in question
    var sheet = spreadsheetDocument.WorkbookPart.Workbook
        .Sheets.Elements<Sheet>()
        .Where(s => s.Name == sheetName).First();
    var sheetReferenceId = sheet.Id;

    // Map the Id to the worksheet part
    WorksheetPart worksheetPart = (WorksheetPart)spreadsheetDocument.WorkbookPart.GetPartById(sheetReferenceId);
    var sheetData = worksheetPart.Worksheet.GetFirstChild<SheetData>();

    // Inset the data at the given location
    for (uint i = 0; i < data.Length; i++)
    {
        uint rowNumber = startRow + i;

        // Find the XML entry for row i
        var row = sheetData.Elements<Row>().Where(r => r.RowIndex == rowNumber).FirstOrDefault();
        if (row == null)
        {
            // Row does not exist yet, create it
            row = new Row();
            row.RowIndex = rowNumber;

            // Insert the row at its correct sequential position
            Row rowAfter = null;
            foreach (Row otherRow in sheetData.Elements<Row>())
            {
                if (otherRow.RowIndex > row.RowIndex)
                {
                    rowAfter = otherRow;
                    break;
                }
            }

            if (rowAfter == null)
                // New row is the last row in the sheet
                sheetData.Append(row);
            else
                sheetData.InsertBefore(row, rowAfter);
        }

        // CellReferences in OpenXML are "normal" Excel cell references, e.g. D15
        string cellReference = columnName + rowNumber.ToString();

        // Find cell in row
        var cell = row.Elements<Cell>()
            .Where(c => c.CellReference == cellReference)
            .FirstOrDefault();
        if (cell == null)
        {
            // Cell does not exist yet, create it
            cell = new Cell();
            cell.CellReference = new StringValue(cellReference);

            // The cell must be in the correct position (e.g. column B after A)
            // Note: AA must be after Z, so a normal string compare is not sufficient
            Cell cellAfter = null;
            foreach (Cell otherCell in row.Elements<Cell>())
            {
                // This is ugly, but somehow the row number must be stripped from the
                // cell reference for comparison
                string otherCellColumn = otherCell.CellReference.Value;
                otherCellColumn = otherCellColumn.Remove(otherCellColumn.Length - rowNumber.ToString().Length);

                // Now compare first to length and then alphabetically
                if (otherCellColumn.Length > columnName.Length ||
                    string.Compare(otherCellColumn, columnName, true) > 0)
                {
                    cellAfter = otherCell;
                    break;
                }
            }

            if (cellAfter == null)
                // New cell is last cell in row
                row.Append(cell);
            else
                row.InsertBefore(cell, cellAfter);
        }

        // Note: This is the most simple approach.
        //       Normally Excel itself will store the string as a SharedString,
        //       which is more difficult to implement. The only drawback of using
        //       this approach though, is that the cell might have been the only
        //       reference to its shared string value, which is not deleted from the
        //       list here.
        cell.DataType = CellValues.String;
        cell.CellValue = new CellValue(data[i]);
    }
}

Please note that this example is not perfect as it does not consider complex scenarios at all (e.g. styling, print margin, merged cells...). For production use you might want to extract certain functionality into a method (for example inserting a row or cell at the correct position) or even into a class all by itself (e.g. the part about comparing cell references in correct order).

Edit: The performance of using the SDK instead of going through the Automation Model is MUCH better (this is probably the second huge advantage of the SDK, the first being that you do not neeed to have Excel installed). If you still see performance bottlenecks here are some ideas for improvement:

  1. Make sure that the cells are already created in the template document (e.g. insert a 0 into each of the using Excel). This saves you the effort to create new cells.
  2. Make sure that none of the cells already exists in the template and then create all of the cells in sequence (saving you the effort to find the correct insert position at least for the row).
  3. Manipulate the XML directly, which is more difficult and requires even more knowledge of the inner workings of OpenXML. This is probably acceptable for rather uniform documents.

Upvotes: 4

Related Questions