TheQuestioner

Reputation: 447

Huge Memory and CPU usage with Multi-threading

I'm trying to create a media player that has a playlist option. When I load 10-20 songs there is no problem, so I tried something more demanding: loading 2048 songs (I took several songs and copied them many times). While loading them in my media player, CPU and RAM usage climbed above 95% (after just the first 250 songs), and one time my computer even restarted. So I tried to throttle the operation so that the application cannot take over the computer: I stop loading new songs if the CPU load is over 85% and the memory load is over 90% (I use 64-bit Windows 8, if that matters). It worked at first, letting me load almost 600 songs, and then:

A first chance exception of type 'System.InvalidOperationException' occurred in mscorlib.dll
A first chance exception of type 'System.InvalidOperationException' occurred in mscorlib.dll
A first chance exception of type 'System.InvalidOperationException' occurred in mscorlib.dll
A first chance exception of type 'System.InvalidOperationException' occurred in mscorlib.dll
A first chance exception of type 'System.InvalidOperationException' occurred in mscorlib.dll
The thread 'vshost.NotifyLoad' (0x1d0c) has exited with code 0 (0x0).
The thread 'vshost.LoadReference' (0x1e48) has exited with code 0 (0x0).
A first chance exception of type 'System.OutOfMemoryException' occurred in mscorlib.dll
A first chance exception of type 'Microsoft.VisualStudio.Debugger.Runtime.CrossThreadMessagingException' occurred in Microsoft.VisualStudio.Debugger.Runtime.dll

Finally the application stopped with "An unhandled exception of type 'System.OutOfMemoryException' occurred in mscorlib.dll".

Now to explain what "loading a song" means in my application:

  1. A thread iterates through every file selected in the OpenFileDialog and checks whether the file extension is known. If it is (and in this case it is: mp3), it appends the path of the file to the end of a Queue.
  2. Another thread checks whether there are any elements in the Queue.
  3. If there are, it extracts the first element and, if the CpuLoad and the MemoryLoad (calculated by another thread) are not too high, starts a new thread that performs the operations described in step 4.
  4. That worker thread loads the song into a System.Windows.Media.MediaPlayer instance and determines three things: the TimeSpan (duration) of the file, whether the file has audio, and whether the file has video. It stores those 3 values, along with the path of the file, in a List.
  5. Yet another thread checks for worker threads that have finished their job and added the media file to the List. If there are any, it removes the references to them so that the garbage collector can take care of them.

The "An unhandled exception of type 'System.OutOfMemoryException' occurred in mscorlib.dll" error occurs on the following line:

MediaCreator[idx].CreatorThread.Start();

That is the line that starts the threads that process the songs. So I tried the following: before the line posted above, I added Thread.Sleep(100);. It worked (all 2048 files were loaded), except that, according to a Stopwatch I added, it took 3 minutes and 28 seconds to load all the songs. Also, I know that Thread.Sleep is usually not the recommended method, and I know that some people even consider it proof of weak programming skills (and I somewhat agree with them). I don't want to use this method anyway, because it obviously takes a long time and cannot be trusted to work on every computer/CPU/HDD/RAM combination. To demonstrate this, I tested with Sleep(10), with which it fails quickly, and with Sleep(20), with which it loads almost 1000 songs before it fails again. I also tried reducing the CPU load threshold to as little as 15% and the memory load threshold to 80%, but no more than 1300 songs were loaded (this also proved ineffective because there were short CPU load peaks of 60%).

I also want to mention that Winamp loaded all the files in just a bit over 30 seconds using around 11 % of my CPU (from 5% to 16%) and less than 40 MB of memory.

So my question is: how should I proceed? I could limit the number of threads so that no more than X threads are running at the same time, but this also seems to me like proof of weak programming skills, as not every CPU can handle the same number of running threads. So what should I do? I really need to extract those details from the songs (how long they are, and whether they are audio or video files) as quickly as possible, with as few resources as possible. I should mention that my application also plays movies; I just don't think anyone needs to load thousands of movies at once, and if I solve the problem for audio, it will be solved for video too, because songs are distinguished from movies only just before they are stored in the list, so nothing there would conflict with a solution to my problem. I really need help getting this sorted out.

EDIT: I am also attaching some diagnostics shown by ANTS Performance Profiler:

  1. http://s24.postimg.org/e3e8cfcit/image1.png
  2. http://s24.postimg.org/71gaq88x1/image2.png

Upvotes: 1

Views: 3589

Answers (3)

Corey

Reputation: 16574

This is one of those common problems with threading that people face over and over. Under normal conditions (a reasonable number of files in this case) the process appears to work fine. Unfortunately, the more files you process, the more overhead accumulates (threads, handles, MediaPlayer instances, etc.) until eventually you run out of resources.

The other downside of too many threads, which will happen long before your system runs out of resources, is disk contention. When you have lots of threads trying to read from different parts of the drive, the hard drive will be forced to spend more time seeking to different locations and less time actually reading data. To further compound the problem, the more threads you have running the more time the drive is spending servicing virtual memory.

Long story short, using hundreds or thousands of threads is A Bad Idea™.

Instead of creating a thread for every file, use a thread pool of a known size (say 10 threads) and recycle those threads to do the work. Or let the threads pull their own data out of a thread-safe collection (I use a thread-safe encapsulation around the Queue<T> class), send results out to another thread-safe collection, and then wait until more data is ready.
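As an illustration only, here is a minimal sketch of what a thread-safe wrapper around Queue<T> might look like. The class name SafeQueue and its members are hypothetical and are not necessarily the wrapper mentioned above; the sketch assumes workers should block while the queue is empty and stop once the producer has closed it.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper: a Queue<T> guarded by a lock, with blocking dequeue.
class SafeQueue<T>
{
    private readonly Queue<T> _items = new Queue<T>();
    private bool _closed;

    public void Enqueue(T item)
    {
        lock (_items)
        {
            _items.Enqueue(item);
            Monitor.Pulse(_items); // wake one waiting worker
        }
    }

    // Signals that no more items will be added.
    public void Close()
    {
        lock (_items)
        {
            _closed = true;
            Monitor.PulseAll(_items); // wake every waiting worker so it can exit
        }
    }

    // Blocks until an item is available.
    // Returns false once the queue is both closed and empty.
    public bool TryDequeue(out T item)
    {
        lock (_items)
        {
            while (_items.Count == 0 && !_closed)
                Monitor.Wait(_items); // releases the lock while waiting

            if (_items.Count == 0)
            {
                item = default(T);
                return false;
            }

            item = _items.Dequeue();
            return true;
        }
    }
}
```

Each of the fixed pool of worker threads would then loop on TryDequeue until it returns false. On .NET 4 and later, the built-in System.Collections.Concurrent types cover the same ground without hand-written locking.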

Upvotes: 1

Jim Mischel

Reputation: 134005

You don't say how you're starting the threads, but it sounds like you're creating a thread (i.e. new Thread(...)) and starting it. If that's the case, you're creating hundreds or possibly thousands of threads, each of which is trying to load a song and verify it. This causes some serious problems:

  1. Having all those songs in memory at one time will very likely cause you to run out of memory.
  2. With hundreds of threads, the computer spends a lot of time doing thread context switches, letting thread 1 run for a bit, then thread 2, then 3, etc. It's quite possible that your computer is thrashing--spending more time doing thread context switches than doing actual work.
  3. The disk drive from which you're loading files can only do one thing at a time. If two threads ask to load a file, one of them will have to wait. Because reading the file probably takes longer than any processing, it's unlikely that having multiple threads doing this work is gaining you much.

Your design is seriously over-complicated. You can simplify it, reduce your memory requirements, and probably increase your processing speed by using a single thread. But if one thread is too slow, you probably want no more than three:

One thread (the main thread) gets the file names, checks them, and places them on a queue. This is step 1 on your list.

Two consumer threads read the queue and do the rest of the processing. Each of these consumer threads waits on the queue and performs step 4 (loading the file, doing the processing, and adding the result to a list).

This kind of thing is incredibly easy to do with a BlockingCollection, which is a concurrent queue. The basic idea is:

// this is the output list
List<MusicRecord> ProcessedRecords = new List<MusicRecord>();
object listLock = new object();  // object for locking the list when adding

// queue of file names to process
BlockingCollection<string> FilesToProcess = new BlockingCollection<string>();

// code for main thread

// Start your consumer threads here.

List<string> filesList = GetFilesListFromOpenDialog(); // however you do this
foreach (string fname in filesList)
{
    if (IsGoodFilename(fname))
    {
        string fullPath = CreateFullPath(fname);
        FilesToProcess.Add(fullPath); // add it to the files to be processed
    }
}
// no more files, mark the queue as complete for adding
// This marks the "end of the queue" so that clients reading the queue
// know when to stop.
FilesToProcess.CompleteAdding();

// here, wait for threads to complete

The code for your threads is pretty simple:

foreach (var fname in FilesToProcess.GetConsumingEnumerable())
{
    // Load file and process it, creating a MusicRecord
    // Then add to output
    lock (listLock)
    {
        ProcessedRecords.Add(newRecord);
    }
}

That's all the thread needs to do. The GetConsumingEnumerable handles waiting (non-busy) on the queue, de-queueing an item, and exiting when the queue is known to be empty.

With this design, you can start with a single consumer thread and scale up to as many as you need. However, it won't make sense to have more threads than you have CPU cores, and as I said before the limiting factor will quite likely be your disk drive.
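To fill in the two placeholders above ("start your consumer threads" and "wait for threads to complete"), here is a minimal, self-contained sketch. The MusicRecord fields, the Process helper, and the PlaylistLoader class are assumptions made for illustration; Process merely records the length of the path as a stand-in for loading the file and reading its duration and audio/video flags.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical record type; the answer above does not define MusicRecord.
class MusicRecord
{
    public string FilePath;
    public int PathLength; // stand-in for duration / has-audio / has-video
}

static class PlaylistLoader
{
    // Placeholder for the real work (opening the file and reading its metadata).
    static MusicRecord Process(string path)
    {
        return new MusicRecord { FilePath = path, PathLength = path.Length };
    }

    public static List<MusicRecord> Load(IEnumerable<string> paths, int consumerCount)
    {
        var processedRecords = new List<MusicRecord>();
        var listLock = new object();

        using (var filesToProcess = new BlockingCollection<string>())
        {
            // Start a fixed number of consumer threads.
            var consumers = new List<Thread>();
            for (int i = 0; i < consumerCount; i++)
            {
                var worker = new Thread(() =>
                {
                    // Blocks while the queue is empty; ends after CompleteAdding
                    // has been called and the queue has been drained.
                    foreach (string fname in filesToProcess.GetConsumingEnumerable())
                    {
                        MusicRecord newRecord = Process(fname);
                        lock (listLock)
                        {
                            processedRecords.Add(newRecord);
                        }
                    }
                });
                worker.IsBackground = true;
                worker.Start();
                consumers.Add(worker);
            }

            // Producer: enqueue every path, then mark the end of the input.
            foreach (string path in paths)
            {
                filesToProcess.Add(path);
            }
            filesToProcess.CompleteAdding();

            // Wait for the consumers to finish before returning the results.
            foreach (Thread consumer in consumers)
            {
                consumer.Join();
            }
        }

        return processedRecords;
    }
}
```

Calling PlaylistLoader.Load(paths, 2) would run two consumers, as suggested above. The order of the returned records is not guaranteed to match the input order, because whichever consumer finishes an item first adds it first. If memory is a concern, BlockingCollection<T> also has a constructor that takes a bounded capacity, which makes Add block once that many items are waiting.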

Upvotes: 5

ServerMonkey

Reputation: 1154

Step 4 onwards doesn't make sense to me. I would suggest you stay with multiple threads for loading the files and gathering information, and also keep the threads that queue the files and work out file lengths etc.

The change I would make is to have a single consumer that reads from the playlist queue and actually plays the files. I don't see why you are using threads in this section.

If using a single consumer doesn't work, can you please expand on this and we'll try to assist.

Upvotes: 0
