Reputation: 11
We have a small Greenplum cluster in which some queries abort.
System related information:
Greenplum Version: 6.3
Master Host: 1
Segment Host: 2
RAM per Segmenthost: 32GB
SWAP per Segmenthost: 32GB
TOTAL segment: 8 Primary + 0 mirror
segment per host: 4
vm_overcommmit_ratio: 95
gp_vmem_protect_limit: 8072MB
statement_mem: 250MB
The queries are executed with a none superuser.
Symptom:
The query failed with the following error massage:
Canceling query because of high VMEM usage. Used: 7245MB, available 801MB, red zone: 7264MB (runaway_cleaner.c:189)
What we tried:
We calculate the Greenplum Parameter with this information: https://gpdb.docs.pivotal.io/6-3/best_practices/sysconfig.html
This help us for some "simple" queries but for more complicated ones the error happend again.
In the next Step we configured the max_statement_mem: 2000MB
This didn't have any effect to the memory consumption on the segmenthosts. We track this with following Query:
select segid, sum (vmem_mb) from session_state.session_level_memory_consumption
where query like '%<some snippet of the query>%'
group by segid
order by segid;
The memory consumption increases very quickly and the error happend again.
CREATE RESOURCE QUEUE adhoc with (ACTIVE_STATEMENTS=6, MEMORY_LIMIT=6291);
ALTER ROLE user1 RESOURCE QUEUE adhoc;
The Database is set to use the resource queue with the parameter gp_resource_manager: queue
We see in the Table 'gp_toolkit.gp_resqueue_status' when we execute a statement that the 'rsqmemoryvalue' is 1048 but the memory consumption in the session_state.session_level_memory_consumption table shows higher values for the segments until the error occurs again.
Has anyone a tip to fix this problem?
Upvotes: 1
Views: 6353
Reputation: 161
Each query will ask for 250MB memory and you set gp_vmem_protect_limit to 8GB. In this case, you can probably run (8GB- primary process memory)/250MB =~ 20-30 queries at the same time. The size of primary process depends on other settings, shared_buffers, wal_buffers,...
Statement_mem can be set in a session. This means some users can set statement_mem higher (up to max_statement_mem) and you will see less queries in concurrent.
When the memory allocated to those concurrent queries reach 90(OR 95) % of gp_vmem_protect_limit, runaway detector will start to cancel queries to protect primary process from OS OOM Kill.
To "fix" the problem (it is not a problem actually), you can
1) set lower default statement_mem, so you can have more queries running concurrently but slower.
2) increase RAM on segment hosts, such that you can increase gp_vmem_protect_limit.
Upvotes: 0