SuperJMN

Reputation: 13972

SelectMany taking a huge amount of memory using ReactiveExtensions

I want to create a pipeline that takes images and returns some derived objects.

I'm using a sequence of bitmaps and for each of them I perform the task (that is asynchronous). So it's as simple as it seems. However, I found out that the memory consumption is REALLY HIGH. To illustrate the problem I've created this test that you can run.

Please, take a look at the memory because it will take up to 400 MB of RAM.

What can I do to avoid taking so much memory? What's happening here?

[Fact]
public async Task BitmapPipelineTest()
{
    var bitmaps = Enumerable.Range(0, 100).Select(_ => new WriteableBitmap(800, 600, 96, 96, PixelFormats.Bgr24, new BitmapPalette(new List<Color>() { new Color() })));
    var bitmapsObs = bitmaps.ToObservable();

    var processed = bitmapsObs.SelectMany(bitmap => DoSomethingAsync(bitmap));
    processed.Subscribe();

    await Task.Delay(20000);
}

private async Task<object> DoSomethingAsync(BitmapSource bitmap)
{
    await Task.Delay(1000);
    return new object();
}

Upvotes: 3

Views: 432

Answers (2)

Enigmativity

Reputation: 117057

It seems to me that you're running into a simple memory usage issue.

If there are 4 bytes per channel and 4 channels per pixel, then your 100 images at 800 x 600 each take 100 x 800 x 600 x 4 x 4 bytes = 733MB (approx.).
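As a rough check, here is that arithmetic written out for the 100 bitmaps in the question's test. The second line is a sketch under a different assumption: that PixelFormats.Bgr24 stores 24 bits (3 bytes) per pixel, as its name suggests, and that no extra padding is added per row.

```latex
\[
100 \times 800 \times 600 \times 4 \times 4 = 768{,}000{,}000\ \text{bytes} \approx 732.4\ \text{MiB}
\]
\[
100 \times 800 \times 600 \times 3 = 144{,}000{,}000\ \text{bytes} \approx 137.3\ \text{MiB}
\]
```

Either way, the total is only reached if all 100 bitmaps are alive at the same time.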

What strikes me in your code, though, and could be giving you grief, is that you start with an enumerable, turn it into an observable, build the processing on tasks, run it all asynchronously with a fire-and-forget .Subscribe(), and then fudge the return with an await Task.Delay(20000);. It's all prone to errors. You should avoid mixing your "monads".

Here's how I would write it:

public async Task BitmapPipelineTest()
{
    await
        Observable
            .Range(0, 100)
            .Select(_ => new WriteableBitmap(
                800, 600, 96, 96,
                PixelFormats.Bgr24,
                new BitmapPalette(new List<Color>() { new Color() })))
            .SelectMany(x =>
                Observable
                    .Start(() =>
                    {
                        Thread.Sleep(10);
                        return new object();
                    }));
}

Upvotes: 1

Jason Boyd

Reputation: 7029

So I don't think the issue is necessarily due to SelectMany or even reactive extensions. It looks like the WriteableBitmap uses unmanaged memory: source code. I believe the issue is that you are, in very rapid succession, creating a bunch of relatively small managed objects that take up a much larger amount of unmanaged memory. From the MSDN:

If a small managed object allocates a large amount of unmanaged memory, the runtime takes into account only the managed memory, and thus underestimates the urgency of scheduling garbage collection.

But we can give the garbage collector hints by using the GC.AddMemoryPressure and GC.RemoveMemoryPressure functions. This will help the GC improve its scheduling. Before we can do that, we need to have some idea of the amount of unmanaged memory being allocated. I believe the unmanaged memory is used to store the pixel array, so I think a good estimate is the pixel width times the pixel height times the number of bytes in each channel times the number of channels. From the MSDN it looks like there are 32 bits (4 bytes) per channel and 4 channels.

I ran some tests using code similar to the following and got really good results:

var processed = 
    Enumerable
    .Range(0, 100)
    .Select(_ => new WriteableBitmap(
        800, 
        600, 
        96, 
        96, 
        PixelFormats.Bgr24, 
        new BitmapPalette(new List<Color>() { new Color() })))
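    .Select(x => new { Bitmap = x, ByteSize = x.PixelWidth * x.PixelHeight * 4 * 4 })
    .ToObservable()
    .Do(x => GC.AddMemoryPressure(x.ByteSize))
    // Carry ByteSize through so the subscriber can remove the same amount of pressure.
    .SelectMany(async x => new { x.ByteSize, Result = await DoSomethingAsync(x.Bitmap) });

processed
    .Subscribe(x => GC.RemoveMemoryPressure(x.ByteSize));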

However, if your source is publishing bitmaps faster than you can handle them then you are still going to have issues. The back pressure will cause memory to be allocated faster than it can be deallocated.

Honestly though, are you really having bitmaps pushed to you? I have no idea what your actual program looks like but in your example code that is clearly a pull based system. If it is a pull based system have you considered PLINQ? PLINQ is great for this type of thing; it gives you really good control over concurrency and you will not have to worry about back pressure.
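To make the PLINQ suggestion concrete, here is a minimal sketch, not a drop-in replacement. It assumes the per-image work can be done synchronously. The names PlinqSketch, Process, and Run are hypothetical, and a plain byte[] stands in for the WriteableBitmap so the example runs without WPF. The buffer size assumes 3 bytes per pixel.

```csharp
using System;
using System.Linq;
using System.Threading;

public static class PlinqSketch
{
    // Hypothetical stand-in for the real per-image work.
    private static int Process(byte[] pixels)
    {
        Thread.Sleep(10);      // simulate some work
        return pixels.Length;  // placeholder "derived object"
    }

    // Pulls 'count' items through the pipeline with at most
    // 'maxConcurrency' of them being processed at any one time.
    public static int[] Run(int count, int maxConcurrency)
    {
        return Enumerable
            .Range(0, count)
            .AsParallel()
            .WithDegreeOfParallelism(maxConcurrency)
            // The buffer is allocated inside the query, so an item only
            // comes into existence when a worker is ready to process it.
            .Select(_ => Process(new byte[800 * 600 * 3]))
            .ToArray();
    }

    public static void Main()
    {
        var results = Run(100, 4);
        Console.WriteLine(results.Length);
    }
}
```

Because each buffer is created lazily by the worker that processes it, the number of live buffers should track the degree of parallelism rather than the length of the sequence, although when the memory is actually reclaimed is still up to the garbage collector.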

Upvotes: 2
