ciscoheat
ciscoheat

Reputation: 3947

Can immutability be a memory hog?

Let's say we have a memory-intensive class like an Image, with chainable methods like Resize() and ConvertTo().

If this class is immutable, won't it take a huge amount of memory when I start doing things like i.Resize(500, 800).Rotate(90).ConvertTo(Gif), compared to a mutable one which modifies itself? How to handle a situation like this in a functional language?

Upvotes: 32

Views: 5775

Answers (6)

Gene T
Gene T

Reputation: 5176

Short, tangential answer: in FP language I'm familiar with (scala, erlang, clojure, F#), and for the usual data structures: arrays, lists, vectors, tuples, you need to understand shallow/deep copies and how implemented:

e.g.

Scala, clone() object vs. copy constructor

Does Scala AnyRef.clone perform a shallow or deep copy?

Erlang: message passing a shallow-copied data structure can blow up a process:

http://groups.google.com/group/erlang-programming/msg/bb39d1a147f72800

Upvotes: 0

Norman Ramsey
Norman Ramsey

Reputation: 202655

If this class is immutable, won't it take a huge amount of memory?

Typically your memory requirements for that single object might double, because you might have an "old copy" and a "new copy" live at once. So you can view this phenomenon, over the lifetime of the program, as having one more large object allocated than you might in a typical imperative program. (Objects that aren't being "worked on" just sit there, with the same memory requirements as in any other language.)

How to handle a situation like this in a functional language?

Do absolutely nothing. Or more accurately, allocate new objects in good health. If you are using an implementation designed for functional programming, the allocator and garbage collector are almost certainly tuned for high allocation rates, and everything will be fine. If you have the misfortune to try to run functional code on the JVM, well, performance won't be quite as good as with a bespoke implementation, but for most programs it will still be fine.


Can you provide more detail?

Sure. I'm going to take an exceptionally simple example: 1000x1000 greyscale image with 8 bits per pixel, rotated 180 degrees. Here's what we know:

  • To represent the image in memory requires 1MB.

  • If the image is mutable, it's possible to rotate 180 degrees by doing an update in place. The amount of temporary space needed is enough to hold one pixel. You write a doubly nested loop that amounts to

    for (i in columns) do
      for (j in first half of rows) do {
         pixel temp := a[i, j]; 
         a[i, j] := a[width-i, height-j]; 
         a[width-i, height-j] := tmp
      }
    
  • If the image is immutable, it's required to create an entire new image, and temporarily you have to hang onto the old image. The code is something like this:

    new_a = Image.tabulate (width, height) (\ x y -> a[width-x, height-y])
    

    The tabulate function allocates an entire, immutable 2D array and initializes its contents. During this operation, the old image is temporarily occupying memory. But when tabulate completes, the old image a should no longer be used, and its memory is now free (which is to say, eligible for recycling by the garbage collector). The amount of temporary space required, then, is enough to hold one image.

  • While the rotation is going on, there's no need to have copies of objects of other classes; the temporary space is needed only for the image being rotated.

N.B. For other operations such as rescaling or rotating a (non-square) image by 90 degrees, it is quite likely that even when images are mutable, a temporary copy of the entire image is going to be necessary, because the dimensions change. On the other hand, colorspace transformations and other computations which are done pixel by pixel can be done using mutation with a very small temporary space.

Upvotes: 29

Rob Lachlan
Rob Lachlan

Reputation: 14479

It depends on the type of data structures used, their application in a given program. In general, immutability does not have to be overly expensive on memory.

You may have noticed that the persistent data structures used in functional programs tend to eschew arrays. This is because persistent data structures typically reuse most of their components when they are "modified". (They are not really modified, of course. A new data structure is returned, but the old one is just the same as it was.) See this picture to get an idea of how the structure sharing can work. In general, tree structures are favoured, because a new immutable tree can be created out of an old immutable tree only rewriting the path from the root to the node in question. Everything else can be reused, making the process efficient in both time and memory.

In regards to your example, there are several ways to solve the problem other than copying a whole massive array. (That actually would be horribly inefficient.) My preferred solution would be to use a tree of array chunks to represent the image, allowing for relatively little copying on updates. Note an additional advantage: we can at relatively small cost store multiple versions of our data.

I don't mean to argue that immutability is always and everywhere the answer -- the truth and righteousness of functional programming should be tempered with pragmatism, after all.

Upvotes: 2

Tomas Petricek
Tomas Petricek

Reputation: 243106

In some cases, immutability forces you to clone the object and needs to allocate more memory. It doesn't necessary occupy the memory, because older copies can be discarded. For example, the CLR garbage collector deals with this situation quite well, so this isn't (usually) a big deal.

However, chaining of operations doesn't actually mean cloning the object. This is certainly the case for functional lists. When you use them in the typical way, you only need to allocate a memory cell for a single element (when appending elements to the front of the list).

Your example with image processing can be also implemented in a more efficient way. I'll use C# syntax to keep the code easy to understand without knowing any FP (but it would look better in a usual functional language). Instead of actually cloning the image, you could just store the operations that you want to do with the image. For example something like this:

class Image { 
  Bitmap source;
  FileFormat format;
  float newWidth, newHeight;
  float rotation;

  // Public constructor to load the image from a file
  public Image(string sourceFile) { 
    this.source = Bitmap.FromFile(sourceFile); 
    this.newWidth = this.source.Width;
    this.newHeight = this.source.Height;
  }

  // Private constructor used by the 'cloning' methods
  private Image(Bitmap s, float w, float h, float r, FileFormat fmt) {
    source = s; newWidth = w; newHeight = h; 
    rotation = r; format = fmt;
  }

  // Methods that can be used for creating modified clones of
  // the 'Image' value using method chaining - these methods only
  // store operations that we need to do later
  public Image Rotate(float r) {
    return new Image(source, newWidth, newHeight, rotation + r, format);
  }
  public Image Resize(float w, float h) {
    return new Image(source, w, h, rotation, format);
  }
  public Image ConvertTo(FileFormat fmt) {
    return new Image(source, newWidth, newHeight, rotation, fmt);
  }

  public void SaveFile(string f) { 
    // process all the operations here and save the image
  }
}

The class doesn't actually create a clone of the entire bitmap each time you invoke a method. It only keeps track of what needs to be done later, when you'll finally try to save the image. In the following example, the underlying Bitmap would be created only once:

 var i = new Image("file.jpg");
 i.Resize(500, 800).Rotate(90).ConvertTo(Gif).SaveFile("fileNew.gif");

In summary, the code looks like you're cloning the object and you're actually creating a new copy of the Image class each time you call some operation. However, that doesn't mean that the operation is memory expensive - this can be hidden in the functional library, which can be implemented in all sorts of ways (but still preserv the important referential transparency).

Upvotes: 3

anijhaw
anijhaw

Reputation: 9422

Yes one of the disadvantage of using immutable objects is that they tend to hog memory, One thing that comes to my mind is something similar to lazy evaluation, which is when a new copy is requested provide a reference and when the user does some changes then initialize the new copy of the object.

Upvotes: 1

Dan Story
Dan Story

Reputation: 10155

Yes. Immutability is a component of the eternal time-space tradeoff in computing: you sacrifice memory in exchange for the increased processing speed you gain in parallelism by foregoing locks and other concurrent access control measures.

Functional languages typically handle operations of this nature by chunking them into very fine grains. Your Image class doesn't actually hold the logical data bits of the image; rather, it uses pointers or references to much smaller immutable data segments which contain the image data. When operations need to be performed on the image data, the smaller segments are cloned and mutated, and a new copy of the Image is returned with updated references -- most of which point to data which has not been copied or changed and has remained intact.

This is one reason why functional design requires a different fundamental thought process from imperative design. Not only are algorithms themselves laid out very differently, but data storage and structures need to be laid out differently as well to account for the memory overhead of copying.

Upvotes: 18

Related Questions