Chris

Reputation: 1229

Impala and mem_limit

I heard a "rumour" that Impala's set mem_limit=xxx; acts more like a throttle than a stop sign. However, my experience with it makes me believe it is a stop sign: when a query exceeds the memory limit, it simply fails with an error rather than becoming more frugal.

Is there any evidence that Impala will make a query run more slowly, using less memory, in order to stay below the mem_limit threshold?

Upvotes: 0

Views: 2069

Answers (1)

Tim Armstrong

Reputation: 570

I can confirm the "rumour". I worked on the spill-to-disk support and other memory management in Impala, so it does in fact exist and work.

I could be more specific given a version and an example of the query and the error. This has been incrementally improved from release to release as we fixed and improved more and more cases. Impala 3.1 has most of the improvements, but there were significant ones before and after that.

There are some known cases where you will hit a "memory limit exceeded" error even on the latest release. For example, a big cross join will eventually run out of memory.
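
As a rough illustration of the throttle case, here is a minimal sketch of setting a per-query limit in impala-shell and then running a large aggregation. The table and column names (sales, customer_id, amount) and the 2g value are hypothetical, and whether a given query spills or fails depends on the Impala version and the query plan.

```sql
-- Per-query memory limit, applied on each node.
-- 2g is an arbitrary value chosen for illustration.
SET MEM_LIMIT=2g;

-- Hypothetical table and columns. A grouping aggregation is one of the
-- operators with spill-to-disk support, so under memory pressure it may
-- run more slowly instead of failing.
SELECT customer_id, SUM(amount) AS total
FROM sales
GROUP BY customer_id;
```

Running PROFILE; in impala-shell after the query finishes is one way to check how much memory each operator used and whether anything spilled.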

Upvotes: 2
