Theodor Solbjørg

Reputation: 756

Lazy Loading Bitmaps in ML.NET

All the examples I have found for ML.NET image classification use image paths when training the pipeline. In production, however, I would like to predict directly from a bitmap, so I have changed the learning pipeline to use Bitmap instead of the path. This causes other problems: with an IEnumerable dataset of 615,000 bitmaps loaded in memory, this PC does not have enough RAM.

Is there a way to create a lazily loaded IEnumerable dataset containing the Bitmap models, so that each image is loaded only when the pipeline is being fit/trained?

EDIT:

Following JonasH's suggestion, I implemented my own enumerator to load the images at runtime. Here is the implementation:

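using System;
using System.Collections;
using System.Collections.Generic;
using System.Drawing;

public class ImageDataCollection : IEnumerable<ImageClassificationData>
{
    private readonly IEnumerable<string> files;

    // Optional preprocessing step (e.g. resizing) applied to each loaded bitmap.
    public Func<Bitmap, Bitmap> Handler { get; set; }

    public ImageDataCollection(IEnumerable<string> files)
    {
        this.files = files;
    }

    public IEnumerator<ImageClassificationData> GetEnumerator()
    {
        // foreach also disposes the underlying enumerator of file paths.
        foreach (string file in files)
        {
            // Load the bitmap only when the consumer asks for this item.
            Bitmap source = new Bitmap(file);
            Bitmap image = source;
            try
            {
                if (Handler != null)
                {
                    image = Handler(source);
                }
                // The label is the last segment of the backslash-separated path.
                string[] segments = file.Split('\\');
                yield return new ImageClassificationData { Label = segments[segments.Length - 1], Image = image };
            }
            finally
            {
                // Runs when the consumer moves on or stops early; frees unmanaged memory.
                image.Dispose();
                if (!ReferenceEquals(source, image))
                {
                    // The handler returned a new bitmap, so the original must be freed too.
                    source.Dispose();
                }
            }
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}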

This works. However, I would like to see it as an extension method of some sort in ML.NET, if that is possible, or at least documented, since this was not made clear. Deep learning uses a lot of RAM, after all :P

Upvotes: 2

Views: 527

Answers (1)

JonasH

Reputation: 36341

LINQ statements are lazily evaluated. So if you have a list of paths and load the bitmaps with a Select statement, they should be loaded lazily unless the pipeline materializes the IEnumerable with .ToList() or similar.
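
As a rough illustration of the deferred evaluation described above, here is a minimal sketch. FakeImage, LoadCount and LazyLoadingDemo are hypothetical names, and FakeImage is only a stand-in for Bitmap so that the example runs without System.Drawing; it does not show how ML.NET itself consumes the sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Bitmap so the sketch runs without System.Drawing.
public sealed class FakeImage
{
    public static int LoadCount;             // how many images have been "loaded" so far
    public string FilePath { get; }

    public FakeImage(string filePath)
    {
        FilePath = filePath;
        LoadCount++;                         // stands in for the expensive decode
    }
}

public static class LazyLoadingDemo
{
    // Select is deferred: no FakeImage is constructed until the sequence is enumerated.
    public static IEnumerable<FakeImage> LoadLazily(IEnumerable<string> paths)
    {
        return paths.Select(p => new FakeImage(p));
    }

    public static void Main()
    {
        var paths = new[] { "a.png", "b.png", "c.png" };
        IEnumerable<FakeImage> images = LoadLazily(paths);
        Console.WriteLine(FakeImage.LoadCount);   // 0: nothing has been loaded yet

        // Each image is constructed only when the loop reaches it.
        foreach (FakeImage image in images.Take(2))
        {
            Console.WriteLine($"{image.FilePath} loaded, total = {FakeImage.LoadCount}");
        }

        // ToList() enumerates the whole sequence again and holds every item at once.
        List<FakeImage> all = images.ToList();
        Console.WriteLine(FakeImage.LoadCount);   // 5: 2 from the loop plus 3 from ToList()
    }
}
```

Note that the sequence is re-evaluated on every enumeration, so each pass over the data loads every image again. That keeps memory low at the cost of repeated disk reads.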

However, bitmaps use unmanaged memory for storage and should be disposed unless you want to rely on the finalizer to free memory. You could perhaps use an iterator block that yields the bitmap and then disposes it, but this might fail if the framework keeps a reference between iterations, as in a parallel loop, for example.
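
The caveat above can be sketched as follows. FakeBitmap and DisposingIteratorDemo are hypothetical names, and FakeBitmap is again only a stand-in for Bitmap that records whether Dispose was called. The ToList() call plays the role of a consumer that keeps references between iterations; whether a given ML.NET trainer does so is not something this sketch establishes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for Bitmap: tracks whether Dispose has been called.
public sealed class FakeBitmap : IDisposable
{
    public string FilePath { get; }
    public bool IsDisposed { get; private set; }

    public FakeBitmap(string filePath) { FilePath = filePath; }

    public void Dispose() { IsDisposed = true; }
}

public static class DisposingIteratorDemo
{
    // Yields one image at a time. The using block disposes it when the consumer
    // asks for the next item or stops enumerating.
    public static IEnumerable<FakeBitmap> LoadAndDispose(IEnumerable<string> paths)
    {
        foreach (string path in paths)
        {
            using (var image = new FakeBitmap(path))
            {
                yield return image;
            }
        }
    }

    public static void Main()
    {
        var paths = new[] { "a.png", "b.png" };

        // Safe: each image is used only inside its own iteration.
        foreach (FakeBitmap image in LoadAndDispose(paths))
        {
            Console.WriteLine($"{image.FilePath} disposed: {image.IsDisposed}");   // False
        }

        // Unsafe: the consumer keeps references beyond the iteration,
        // so every image it holds has already been disposed.
        List<FakeBitmap> kept = LoadAndDispose(paths).ToList();
        Console.WriteLine(kept.All(image => image.IsDisposed));                    // True
    }
}
```

With a real Bitmap, touching a disposed instance typically throws rather than returning a flag, so a consumer that buffers items would fail at the point of use.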

Upvotes: 2
