Reputation: 135
I am generating transaction reports which can contain a large amount of data. But the issue is that excel only support the max rows of 1,048,576.
I am using EPPlus v6.2.3. It is crashing with the error Index Out of Bound on large data. I want to add new sheet in the workbook if data exceeds the max row count. How can I do that?
Below is my code snippet:
public static ExcelPackage ExportToExcel(ExcelPackage excelPackage, DataTable dataTable, ExcelCellAddress startCellAddress)
{
//Create the worksheet
ExcelWorksheet ws = excelPackage.Workbook.Worksheets[0];
ws.Cells[startCellAddress.Address].LoadFromDataTable(dataTable, false);
// Calculate the final column letter
string finalColLetter = string.Empty;
string colCharset = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
int colCharsetLen = colCharset.Length;
if (dataTable.Columns.Count > colCharsetLen)
{
finalColLetter = colCharset.Substring((dataTable.Columns.Count - 1) / colCharsetLen - 1, 1);
}
finalColLetter += colCharset.Substring((dataTable.Columns.Count - 1) % colCharsetLen, 1);
string cellRange = string.Format("{0}:{1}{2}", startCellAddress.Address, finalColLetter, startCellAddress.Row + dataTable.Rows.Count - 1);
ExcelRange range = ws.Cells[cellRange];
// Assign borders
SetNormalCellStyle(range);
return excelPackage;
}
Upvotes: 1
Views: 1209
Reputation: 135
Finally I have achieved to add worksheet as the data exceeds the max no. of allowed rows in excel, using EPPlus and without looping on the large dataset.
Here is the sample code:
//Create the worksheet
ExcelWorksheet ws = null;
int prevAddedRows = 0;
int excelMaxRows = 1048576;
int totalRows = dataTable.Rows.Count;
int maxCount = excelMaxRows;
int startIndex = 0;
int remainingRowsCount;
ExcelCellAddress startCell = startCellAddress;
int totalSheets = (totalRows / excelMaxRows) + 1;
for (int i = 0; i < totalSheets; i++)
{
if (i == 0)
ws = excelPackage.Workbook.Worksheets[i];
else
{
ws = excelPackage.Workbook.Worksheets.Add(string.Format("Sheet{0}", i + 1));
startCell = new ExcelCellAddress("A1");
}
DataTable selectedDataTable = dataTable.AsEnumerable().Skip(startIndex).Take(maxCount).CopyToDataTable();
ws.Cells[startCell.Address].LoadFromDataTable(selectedDataTable, false);
prevAddedRows += maxCount;
remainingRowsCount = totalRows - prevAddedRows;
if (remainingRowsCount < excelMaxRows)
{
maxCount = remainingRowsCount;
}
startIndex = prevAddedRows;
}
return excelPackage;
}
Upvotes: 1
Reputation: 432
You can divide you data from multiple sheets by creating a sheet in a loop
Whenever the data reaches your threshold of 1,048,576 start a new sheet
Without sample code on your question I took the liberty to craft a simple example that should work regardless of the package or version:
using OfficeOpenXml;
(...)
int worksheetIndex = 0,
row = 1,
dataPerWorksheet = 1_000_000,
dataItems = 3_000_000;
ExcelWorksheet currentWorksheet = null!;
for (int i = 0; i < dataItems; i++)
{
//check if you need a new sheet
if (i % dataPerWorksheet == 0)
{
worksheetIndex++;
currentWorksheet = package.Workbook.Worksheets.Add($"Data{worksheetIndex}");
row = 1;
}
//Add data to the row on current sheet
currentWorksheet.Cells[row, 1].Value = i; //I used i as the data
row++;
}
With the addition of boiler plate this gave me one excel file with 3 worksheets filled with continuous count from 0 to 3_000_000, each sheet holding 1_000_000 rows of data.
I am working with OpenOffice but this should work with little to none modifications on Excel
Regarding your edit, you can do the same with your example by modifying the above example to get data from DataTable
For example:
int worksheetIndex = 0,
row = startCellAddress.Row,
col = startCellAddress.Column
dataPerWorksheet = 1_000_000;
//Declare worksheet
ExcelWorksheet ws = null;
for (int r = 0; r < dataTable.Rows.Count; r++)
{
if (r % dataPerWorksheet == 0)
{
worksheetIndex++;
ws = excelPackage.Workbook.Worksheets.Add($"Data{worksheetIndex}");
row = startCellAddress.Row;
}
ws.Cells[Row:row, Col:col].Value = dataTable.Rows[r]["A"];
ws.Cells[Row: row, Col: col+1].Value = dataTable.Rows[r]["B"];
row++;
}
Regarding the comment: you need to loop over each row to copy the data from one row to the other. The method LoadFromDataTable()
probably loops over all rows as well, just not on your side of the code.
Regarding performance I am not sure if there are better ways to do this, I am not an expert, but it took much longer to open the file than to create it.
Result
//rows in data= 1_500_000
//dataPerWorksheet = 500_000
ExportToExcel(package, data, new("B2"));
Your method with the multiple worksheets modification took 08s15ms and the file has 19,4 MB (20 348 432 bytes) with my dataset.
Upvotes: 1
Reputation: 29578
If you are willing to consider an alternative to EPPlus, I maintain an Excel library that might work for you: Sylvan.Data.Excel.
Here is a complete C# 11 example showing how to write data that exceeds the limit of a single sheet:
using Sylvan.Data;
using Sylvan.Data.Excel;
using System.Diagnostics;
var sw = Stopwatch.StartNew();
// synthesize some data, and expose it as a DbDataReader
var data =
Enumerable.Range(0, 2_000_000)
.Select(i => new { Id = i, Date = DateTime.Today.AddHours(i), Value = Math.PI * i });
System.Data.Common.DbDataReader dr = data.AsDataReader();
// create an ExcelDataWriter, which can also write .xlsb files
using var edw = ExcelDataWriter.Create("report.xlsx");
bool done = false;
int page = 1;
do
{
// write the data to the excel file
var result = edw.Write(dr, "Report_" + page++);
// result will identify the number of rows written
// and IsComplete indicating if the entire DbDataReader was consumed.
done = result.IsComplete;
} while (!done);
sw.Stop();
var size = new FileInfo("report.xlsx").Length;
Console.WriteLine($"Created {size} file in {sw.Elapsed}.");
On my machine, this produces a ~30MB file in about 3.8 seconds.
Created 30313078 file in 00:00:03.8263647.
Upvotes: 0