Reputation: 1098
I have to apply simple formatting to 10,000+ Excel files. I already have a multicore program working. It opens a new Excel instance for every Excel file. If it matters, this code is currently inside of my Form code.
I was hoping to use one Excel instance with many workbooks. Is it possible to make use of multicore capabilities if I only use one instance of Excel? How?
-If the answer to the above is no, a more complicated question might be: How many instances of Excel should I spawn, and how can I split the workbooks between each instance?
Current code here:
private void SelectFilesButtonClick(object sender, EventArgs e)
{
var listOfExcelFiles = OpenExcel.FileNames.ToList();
Parallel.ForEach(listOfExcelFiles, TrivialExcelEditFunction);
}
private void TrivialExcelEditFunction(string file)
{
//Open instance of Excel
//Do processing
//Close instance of Excel
}
Updated code below, but still doesn't limit itself to the proper number of cores. Not sure why.
private void SelectFilesButtonClick(object sender, EventArgs e)
{
var listOfExcelFiles = OpenExcel.FileNames.ToList();
int cores = Environment.ProcessorCount;
//Split one list into list of lists. Number of lists based on number of cpu cores
List<List<object>> listOfLists = Split(listOfExcelFiles, cores);
//Limits number of threads to number of cores
Parallel.ForEach(listOfLists, new ParallelOptions { MaxDegreeOfParallelism = cores }, EditExcel);
}
private void TrivialExcelEditFunction(string file)
{
//Open instance of Excel
foreach (string file in files)
{
//Do processing
}
//Close instance of Excel
}
Lets say I have 4 cores. My thinking was to divide list of files into 4 equal lists, limit threads to 4, then I'd be able to process files in 4 instances of Excel. I thought that would mean TrivialExcelEditFunction would only run 4 times. Instead this function is being run anywhere between 14 and 27 times. Please tell me where I went wrong.
Upvotes: 3
Views: 1029
Reputation: 11040
As @Servy pointed out you can create multiple threads with one Excel instance per thread.
Make sure each thread creates the the Excel object it's using.
Expect weirdness. With 10,000+ files you can probably expect minor issues in at least a few. If a hidden Excel instance tries to prompt the user it may appear frozen.
Word has some issues when using some functions that call into single-threaded COM objects, I suspect Excel may have some of those too. In Word these manifest in various ways, including the instance freezing or being closed.
If the formatting is really simple and your files are xmlx it may be feasible writing some code to apply the changes through OOXML SDK, which wouldn't require an actual Excel instance
Upvotes: 2
Reputation: 2337
I think this is what @Servy is referring to above. I use this daily at the moment in a piece of code I am working with which does handle Excel and most certainly hasn't failed yet. Also make sure that you marshal you COM object correctly. A little extra info as well, Excel 2010 uses multi-core internally, watch out for performance (with large file especially).
var tasks = new Task[Environment.ProcessorCount];
for (int i = 0; i< Environment.ProcessorCount; i++)
{
tasks [i] = Task.Factory.StartNew(() =>
{
// your Excel code here.
});
}
Task.WaitAll(tasks);
Upvotes: 1
Reputation: 203802
Create N tasks/threads to do the processing where "N" is the number of cores on your machine. Give each task/thread one instance of Excel
.
You won't be able to control a single instance from multiple threads (at least not productively; it will only be working on one thread's task at a time), and creating so many instances of Excel is quite inefficient.
Upvotes: 2