Robert3452

Reputation: 1476

Why does Clojure have a memoize function?

I am new to Clojure and I have just learned about and experimented with the memoize function. It seems strange to me that this function needs to exist.

Firstly, functions with side effects end with !

Secondly, using memoize is very simple.

Why doesn't Clojure just do this for me? There is a balance between memory use and performance, but the Clojure runtime could easily have a chunk of RAM allocated to function results. If a function is called with the same arguments multiple times, use the cached result. If memory runs out, clear out part of the cache, and keep track of cache hits so that frequently recalled results are less likely to be removed.

If I designed this I would even set a minimum performance level for functions so that if a function call is quicker than cache retrieval it is not cached. (Or make this a property of how all function calls work.)

Can anyone explain why Clojure doesn't do this?

Thanks

Upvotes: 2

Views: 2274

Answers (2)

Jason

Reputation: 12283

Although @Arthur Ulfeldt gives a good answer about the difficulty of caching, there's another reason why Clojure doesn't - and can't - do as you suggest: Clojure does not manage memory. When Clojure runs on the JVM, the JVM manages memory. When Clojure is compiled to JavaScript, the JavaScript engine manages memory. Therefore, Clojure has limited knowledge and control of the runtime environment - the same limits you have as the programmer, in fact. Clojure relies on the underlying runtime environment to manage memory, so it cannot track how much memory is free.

It could, as you suggest, allocate a chunk for caching. Then the question becomes, how much to allocate, especially since Clojure is not responsible for memory management. Whatever chunk it decides to allocate would be taking away from application memory - your memory for your application. On the JVM there's a fixed amount of memory available for the application at startup - the maximum heap size. Generally, the maximum heap size is conservative and relatively small unless you change it. So having Clojure take a chunk of memory for caching without your knowledge or control would be presumptuous and probably frustrating to many programmers.

I believe the purpose of the memoize function is twofold: 1) an example; 2) useful for simple cases or short-running applications. I generally avoid it unless I know my arguments will be few and repeated, and the function will be expensive to call. I do this because I know I'm going to lose that memory, forever, until my application quits. Some of my applications run for upward of a year.
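To make the "lose that memory forever" point concrete, here is a minimal sketch. `memoize-visible` is a hypothetical helper I wrote for illustration (it is not part of Clojure); it follows the same atom-backed approach as the built-in memoize but also hands back the cache so its size can be inspected.

```clojure
;; Hypothetical helper: behaves like clojure.core/memoize, but also
;; returns the backing atom so the cache can be inspected.
(defn memoize-visible [f]
  (let [cache (atom {})]
    {:cache cache
     :f     (fn [& args]
              (if-let [entry (find @cache args)]
                (val entry)
                (let [ret (apply f args)]
                  (swap! cache assoc args ret)
                  ret)))}))

(def squared (memoize-visible (fn [n] (* n n))))

;; Calling with 1000 distinct arguments leaves 1000 entries behind.
(doseq [n (range 1000)]
  ((:f squared) n))

(count @(:cache squared)) ; => 1000
```

Nothing in this scheme ever removes an entry, so a long-running process that keeps seeing new arguments keeps growing the map.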

Upvotes: 1

Arthur Ulfeldt

Reputation: 91554

There is an old joke in computer science that there are only two hard problems:

  1. cache invalidation
  2. naming things

The built-in memoize function is a great start at what it does, and it's useful and sufficient in a few cases. It does, however, fit the joke above nicely. It's a somewhat awkward name (opinion here) and it fails very badly at cache invalidation. It assumes that once a function has been called with a given set of arguments the result will always be relevant, that the function is pure and does not encounter errors, and that all calls are equally relevant for all time. The real world is full of nuances that turn out to be really important for caching:

  1. many return values are not useful forever
  2. many functions are unbounded in scope (math for instance)
  3. many functions are faster than the cache (math again)
  4. some functions have equivalent arguments. These should share a single cache entry.
  5. memoize doesn't actually guarantee to run a function only once.
  6. many functions are never going to be pure (like accepting a network connection)
  7. some arguments don't affect the return value.

All these things come into play when designing caching for web apps, for instance. #5 is an interesting case: consider what happens if you memoize a very slow function, and there are two calls to it one second apart. Which return value becomes the memoized result? In what cases is this important? These details can really matter, especially if one of the calls encounters an unusual circumstance.
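A small sketch of #5, assuming a made-up `slow-square` function where the sleep stands in for expensive work. Two overlapping calls both find the cache empty, so the underlying function can run more than once.

```clojure
(def call-count (atom 0))

;; Hypothetical slow function; the sleep stands in for expensive work.
(defn slow-square [n]
  (swap! call-count inc)
  (Thread/sleep 500)
  (* n n))

(def fast-square (memoize slow-square))

;; Two overlapping calls with the same argument, each on its own thread.
(let [a (future (fast-square 4))
      b (future (fast-square 4))]
  [@a @b])   ; both calls return 16

;; Typically 2 rather than 1: the second call started before the
;; first one had stored its result.
@call-count
```

For a pure function this only wastes work. If the two runs could return different values, whichever finishes last is the one that stays in the cache.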

In the last eight years doing Clojure professionally I have seen memoize used in production many times, and it has always been replaced in short order by a call to one of the functions in clojure.core.cache once the inevitable problems arise.

If you find yourself wanting memoize there's a good chance you will be happier with core.cache. It offers many more nuanced options to fit more of the real world cases.
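To illustrate the kind of nuance involved without pulling in a dependency, here is a hand-rolled sketch of time-based expiry (point #1 in the list). `memoize-ttl` and its `now-fn` parameter are hypothetical names of mine, not core.cache's API; the clock is passed in so the example is deterministic.

```clojure
;; Hypothetical helper, NOT part of core.cache: memoize with a time-to-live.
;; now-fn returns the current time in milliseconds.
(defn memoize-ttl [f ttl-ms now-fn]
  (let [cache (atom {})]
    (fn [& args]
      (let [now   (now-fn)
            entry (get @cache args)]
        (if (and entry (< (- now (:at entry)) ttl-ms))
          (:value entry)
          (let [ret (apply f args)]
            (swap! cache assoc args {:value ret :at now})
            ret))))))

;; A fake clock so expiry can be shown without sleeping.
(def clock (atom 0))
(def calls (atom 0))

(def lookup
  (memoize-ttl (fn [k] (swap! calls inc) (str "value-for-" k))
               1000
               #(deref clock)))

(lookup :a)            ; computes
(lookup :a)            ; served from the cache
(swap! clock + 1500)   ; pretend 1.5 seconds pass
(lookup :a)            ; entry is stale, computes again

@calls ; => 2
```

Note that this sketch only refreshes stale entries on lookup and never evicts anything, so it does not bound memory. Handling that properly, along with the other cases in the list, is the reason to reach for a real caching library.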

Upvotes: 10
