Reputation: 87
I would like to know whether Spark uses the Linux cached memory or the Linux used memory when we call the cache/persist method.
I'm asking because we have a cluster whose machines show only 50% used memory and 50% cached memory, even during long jobs.
Thank you in advance,
Upvotes: 1
Views: 390
Reputation: 1160
Cached/buffered memory is memory that Linux uses for disk caching: whenever a file is read, its contents go into the page cache. You can treat cached memory as effectively free, because the kernel reclaims it on demand when applications need it. The JVM process of a Spark executor does not directly consume cached memory, so if you see only 50% of memory in use on your machines, the Spark executors are definitely taking no more than that 50%. You can use `top` or `ps` to see how much memory a Spark executor actually takes; it is usually a little more than the current heap size.
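If you want to check this programmatically rather than with `top`, here is a minimal sketch (Linux-specific, assuming a standard `/proc/meminfo`) that separates "really used" memory from the reclaimable page cache, which is the distinction the question is about:

```python
# Read /proc/meminfo to distinguish truly-used memory from the page cache.
# Values in /proc/meminfo are reported in kB.
def meminfo():
    info = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":", 1)
            info[key] = int(value.split()[0])
    return info

m = meminfo()
# Buffers + Cached is the disk cache the kernel can reclaim on demand.
cached = m["Cached"] + m["Buffers"]
# Memory actually held by processes (e.g. Spark executor JVMs):
used = m["MemTotal"] - m["MemFree"] - cached
# Memory effectively available to new allocations:
available = m.get("MemAvailable", m["MemFree"] + cached)
print(f"used: {used} kB, page cache: {cached} kB, available: {available} kB")
```

If `used` stays around 50% while `cached` fills the rest, that matches the behavior you describe: the executors never needed more heap, and the kernel put the idle memory to work as disk cache.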
Upvotes: 1