Reputation: 32014
I know the Most Recently Used (MRU) algorithm and its counterpart, Least Recently Used (LRU).
LRU makes sense to me: the least recently used element is the one least likely to be needed again in the future. But the most recently used element seems very likely to be needed again soon, so why evict it? What is a reasonable scenario for MRU?
Upvotes: 52
Views: 32730
Reputation: 5078
Lots of contrived arguments here, but MRU has its place. It may seem counter-intuitive, especially if you learn about it right after LRU, which has the most applications (with all of its optimizations and variations) and is a good approximation to the optimal algorithm, based on the observation that keys that have been heavily used in the last few instructions will probably be heavily used again in the next few.
Here are two examples I could come up with.
Physical World Example
Assume your fridge has some limited capacity (no holes in the fridge or the wall behind it). Yesterday you stored couscous leftovers from breakfast, leftovers of "Bosanski Lonac" (#5 in the list) from lunch, and some "Shopska salad" from dinner in the fridge. Today it is lunch time, you have your mother-in-law (MIL) over as a guest, and you have just made nice pasta for lunch. The whole family finishes lunch, everyone is happy, and there are even some pasta leftovers that you think might make a good lunch to take to work tomorrow. You open the fridge to store away the pasta leftovers and realize there is no more space (capacity) in the fridge.
You are in a rush to spend quality time with your MIL, so the first thing you do to make space is a quick purge of expired items (you look at those "Best Before:" or "Exp:" labels). You end up removing a few sauces from the fridge door that nobody ever eats anyway (TTL-expired items first), but as soon as you are done you realize that won't help with storing the pasta. You are staring at that pan of "Bosanski lonac" leftovers from yesterday's lunch (approximately the same size as your pan of pasta) and instantly realize that your MIL loves it. Like every good son-in-law, you offer your MIL some of the "Bosanski lonac" leftovers. She is not quite hungry, but after the second nudge she agrees to eat half of it, and now that she has agreed, you are no longer shy about eating the other half yourself (despite the filling pasta). Plus, you really don't want that "Bosanski lonac" to go bad; it should certainly be eaten while it's still somewhat fresh. Win-win. You evict that pan of "Bosanski lonac" from the fridge like it was never there, store the pan of pasta leftovers, and, like every good son-in-law, ask your MIL to warm up the "Bosanski lonac" leftovers for both of you. Done. Everyone is happy (unless the fridge decides to break, you know, just because all the other appliances did in the last ~1 year and everything fails all the time).
Video Streaming Example
Imagine you are building a recommendation engine for your video streaming website (think Netflix!) and you suggest to each customer movies from the same genre as her last watched movie. The cache is limited (there are more movies in that genre than you plan to recommend, with or without pagination). If you keep all the movie recommendations of a particular genre for a particular user in a cache, then you should always evict the most recently watched movie from her cache of recommendations, since it won't be needed again in the near future. An MRU eviction policy can be used for such eviction requirements.
For implementation detail, see evict() in MRUCache vs. evict() in LRUCache in the modules of the cacheout Python library. The implementation uses OrderedDict from collections. Cache entries are stored in an OrderedDict so that key ordering based on the cache type can be maintained without the need for additional list(s)/queue(s). Essentially, the key order of the OrderedDict is treated as an "eviction queue", with the convention that entries at the head of the queue are "newer" while entries at the tail are "older". The natural iteration order of an OrderedDict matches the order in which its items were added. This is why set() in Cache deletes and then re-inserts an item: it naturally becomes the most recently accessed entry in the OrderedDict. For the same reason, get() in LRUCache (which inherits from Cache and overrides get()) moves a cache-hit item to the end of the OrderedDict so that it becomes the most recently used. Therefore LRUCache doesn't have to do anything more specific at eviction time than what Cache already does in evict() and _popitem(): it simply takes the next item from the iterator over the OrderedDict, which returns the least recently accessed item.
For MRUCache the only difference is the end from which it takes the item to be evicted: it inverts the natural iteration of the OrderedDict in __next__() so that the most recently accessed item is returned.
The difference comes down to MRU removing the object being evicted from the head of the eviction queue (its strongest eviction preference is the newest, most recently accessed element in the OrderedDict) vs. LRU removing the object being evicted from the tail of the queue (its strongest eviction preference is the oldest, least recently accessed element in the OrderedDict).
REMEMBER: The convention is that the head of the queue is always the most recently accessed (read or freshly written) object and the tail of the queue is always the least recently accessed (read or freshly written) element, regardless of the backing data structure that is most efficient in a particular language.
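To make the head/tail convention concrete, here is a minimal, self-contained sketch of the same OrderedDict-based idea. This is not the cacheout code itself; the class names merely mirror the library's naming, and details such as TTL handling and thread safety are omitted. Note that the "head of the eviction queue" in the convention above corresponds to the back of the OrderedDict, where Python appends newly inserted keys.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU sketch: evict the least recently accessed entry."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._cache = OrderedDict()  # front = least recently used, back = most recently used

    def get(self, key):
        value = self._cache[key]      # raises KeyError on a miss
        self._cache.move_to_end(key)  # a hit becomes the most recently used entry
        return value

    def set(self, key, value):
        if key in self._cache:
            del self._cache[key]  # re-insert so the key moves to the "newest" end
        elif len(self._cache) >= self.maxsize:
            self._cache.popitem(last=False)  # evict from the LRU end (front)
        self._cache[key] = value


class MRUCache(LRUCache):
    """Same bookkeeping as LRU; only the end chosen for eviction differs."""

    def set(self, key, value):
        if key in self._cache:
            del self._cache[key]
        elif len(self._cache) >= self.maxsize:
            self._cache.popitem(last=True)  # evict from the MRU end (back) instead
        self._cache[key] = value


mru = MRUCache(maxsize=2)
mru.set("a", 1)
mru.set("b", 2)
mru.set("c", 3)           # cache is full, so "b" (the most recently used key) is evicted
print(list(mru._cache))   # ['a', 'c']
```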
Upvotes: 1
Reputation: 683
I think this is just a misunderstanding.
“MRU cache” is correct in describing the cache as a set of most-recently-used items. LRU is the eviction strategy that enables MRU caches, by getting rid of the least-recently-used item when making room for a new one. The same would apply to MFU caches with an LFU eviction strategy.
Caches are speculative by nature: the hope is that an item being looked up will be looked up again in the relatively near future (temporal locality), so it is worth holding resources to keep it around for faster/cheaper access, on the (necessary) premise that it is costly/slow to acquire.
It is easy to come up with any number of scenarios where this is not the case (there are examples in the comments here), but those describe situations where a cache is useless: by the definition of the problem there is no chance that an item will be looked up again anytime soon, so there is no need for a cache in those scenarios.
A cache is not a free list for theater seats, and a cache is not a pre-filled list for autobus parking lots. These are not caches.
“A cache is like the fridge in your kitchen: you don’t go all the way to the grocery store every time you want a sip of milk.” — Landy Wang
Upvotes: 1
Reputation: 719376
I think both @Jon Skeet's and @Jeremiah Willcock's answers are describing the use of MRU as a way to avoid polluting the cache with useless entries.
This only works if your cache APIs allow you to change the policy on the fly; e.g. on a per-request basis. Setting your cache policy to MRU in "normal" situations is probably a bad idea ... because your cache becomes ineffective once it fills up.
MRU has the problem that if you get a hit on an entry that is often used in "normal" mode while doing MRU lookups, you end up throwing out the entry ...
Better alternatives to MRU for doing a scan without polluting the cache are bypassing the cache for the scan entirely, or using a scan-resistant eviction policy.
For what it is worth, I cannot think of any use-cases for MRU that don't fit this general pattern.
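For illustration, here is a rough sketch of what "change the policy on the fly" could look like: a single cache whose eviction end is chosen per request. The SwitchableCache class and its scan flag are hypothetical, not the API of any particular library, and the sketch also exhibits the hot-entry problem mentioned above (a scan hit still promotes the entry, so the next scan miss can throw it out).

```python
from collections import OrderedDict


class SwitchableCache:
    """Hypothetical cache whose eviction end can be chosen per request."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self._entries = OrderedDict()  # front = least recently used

    def get(self, key, loader, scan=False):
        if key in self._entries:
            # Note: a scan hit still promotes the entry to "most recently used",
            # which is exactly how a hot entry can get thrown out during a scan.
            self._entries.move_to_end(key)
            return self._entries[key]
        value = loader(key)  # cache miss: fetch from the slow source
        if len(self._entries) >= self.maxsize:
            # Normal request: evict the LRU end. Scan request: evict the MRU end,
            # so one-off scan entries push each other out instead of the working set.
            self._entries.popitem(last=scan)
        self._entries[key] = value
        return value
```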
Incidentally, @Jon Skeet's example of bus arrivals is not always borne out in reality, due to the bunching effect.
If a bus is running late, there are likely to be more people than average waiting at each bus stop. The bus has to stop more frequently and stays longer at each stop. This slows the late bus down further.
A bus that is on time and following the late bus will typically find fewer people than average waiting at each stop (because they just got onto the late bus). This speeds up the following bus.
The net result is that the buses tend to bunch up.
See: https://en.wikipedia.org/wiki/Bus_bunching
Upvotes: 5
Reputation: 21
I know one scenario where MRU is better than LRU. In a database cache, assume we have a cache that can hold 50 blocks, and we have two tables that exceed the size of the cache (let's say 51 blocks each). For a block nested loop join operation, we need to join each row against the entire other table. But since the cache is smaller than that table, we need to drop some blocks.
With LRU as the cache policy, the cache drops the first block of the table and replaces it with the last block. However, when we continue to join the next row against the entire table, we find the first block is missing, so we load it into the place of the second block from the previous round, and so on: the entire table gets reloaded, block by block, on every pass.
But with MRU, we only need to replace the most recently loaded block, so we can reuse all the blocks loaded before.
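Here is a rough simulation of that access pattern (the 50-block cache and 51-block inner table come from the numbers above; the simulate helper is purely illustrative, not code from any database engine):

```python
from collections import OrderedDict


def simulate(policy, cache_blocks=50, table_blocks=51, passes=5):
    """Count cache hits while the inner table of a block nested loop join
    is scanned sequentially, once per pass over the outer input."""
    cache = OrderedDict()
    hits = 0
    for _ in range(passes):
        for block in range(table_blocks):
            if block in cache:
                hits += 1
                cache.move_to_end(block)  # mark the block as most recently used
            else:
                if len(cache) >= cache_blocks:
                    # LRU evicts the oldest block, MRU evicts the newest block
                    cache.popitem(last=(policy == "MRU"))
                cache[block] = True
    return hits


print("LRU hits:", simulate("LRU"))  # 0   -- every block is evicted just before it is needed again
print("MRU hits:", simulate("MRU"))  # 200 -- only one block per pass misses after the first pass
```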
Hope this can help you
Upvotes: 2
Reputation: 4161
Perhaps a more tangible example would be a media server. When the user has completed watching a video (let's say it's an episode of a TV show), they are presumably the least likely to want to view it again. So if you must evict something, evict the most recently viewed item.
In practice though, I believe this type of cache is typically used in addition to an LRU or LFU cache, where the two caches in tandem allow you to cover a wide variety of cases.
Upvotes: 10
Reputation: 19
Let's say you are caching the seats of a concert hall to expedite booking. As your application books seats, remove the corresponding items from the cache, since they are no longer required by the booking application.
Upvotes: 1
Reputation: 30989
The use case is when you are iterating through the same (larger-than-cache) data multiple times, and so you will not go back to recently accessed data any time soon.
Upvotes: 4
Reputation: 1503080
Imagine you were looking up the details of buses as they arrived at a bus stop, based on their bus number (or whatever identifier you use).
It's somewhat reasonable to think that if you've just seen a number 36 bus, you're less likely to see another one imminently than to see one of the other buses that stop there.
Just one example, but the idea is more general: in some cases, having "just seen something" is a good indicator that you're unlikely to see the same thing again soon.
Upvotes: 87