Reputation: 268
Where is the real cost in Greenplum queries? I have a simple SQL query that returns in 80 ms on SQL Server but takes 1,500 ms on Greenplum.
The "explain analyze" shows "Total excute time: 55ms", but that is not true, I tried use odbc dirver or psql client connect to the (greenplum)database, query cost stable at 1,500ms.
Each table holds no more than 10K rows, is partitioned, and shows no skew (there are no indexes). On every node (master and segments), CPU, IO, memory, and network utilization all average below 10 percent.
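I checked skew with a per-segment row count (a sketch against one of the two tables above; gp_segment_id is a Greenplum system column present on every table):

-- Roughly equal counts across segments means no data skew.
SELECT gp_segment_id, count(*)
FROM corehr_employmentrecord
WHERE tenantid = 106996
GROUP BY gp_segment_id
ORDER BY gp_segment_id;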
Question: if the executor finishes in ~55 ms, where is the remaining ~1,450 ms spent?
SQL:
SELECT CoreHR_EmploymentRecord.ChangeTypeOID AS CoreHR_EmploymentRecord_ChangeTypeOID,
MAX(CoreHR_EmploymentRecord.ChangeTypeAlias) AS CoreHR_EmploymentRecord_ChangeTypeAlias,
COUNT(DISTINCT CoreHR_EmployeeInformation.UserID) AS CoreHR_EmployeeInformation_Count_UserID
FROM CoreHR_EmployeeInformation WITH (NOLOCK)
LEFT JOIN CoreHR_EmploymentRecord WITH (NOLOCK)
ON CoreHR_EmployeeInformation.TenantId = CoreHR_EmploymentRecord.TenantId
AND CoreHR_EmployeeInformation.UserID = CoreHR_EmploymentRecord.UserID
AND CoreHR_EmploymentRecord.TenantId = 106996
WHERE CoreHR_EmployeeInformation.TenantId = 106996
AND (CoreHR_EmploymentRecord.IsCurrentRecord = 1)
AND (CoreHR_EmploymentRecord.StartDate
BETWEEN '2018-05-24 00:00:00' AND '2018-06-22 23:59:59.998'
)
AND CoreHR_EmployeeInformation.ApprovalStatus IN
(
SELECT * FROM dbo.f_SplitToVarchar(4, ',')
)
GROUP BY CoreHR_EmploymentRecord.ChangeTypeOID
ORDER BY CoreHR_EmploymentRecord.ChangeTypeOID ASC;
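(The query above is the SQL Server form. On Greenplum the WITH (NOLOCK) hints and dbo.f_SplitToVarchar do not exist; judging from the plan below, the split call collapses to a single value. A sketch of the assumed Greenplum-side equivalent:)

-- Assumed Greenplum form: NOLOCK hints dropped, and the split-function
-- subquery replaced by a literal list, matching the
-- approvalstatus = '4' filter visible in the plan below.
SELECT r.ChangeTypeOID,
       MAX(r.ChangeTypeAlias)   AS ChangeTypeAlias,
       COUNT(DISTINCT i.UserID) AS Count_UserID
FROM CoreHR_EmployeeInformation i
LEFT JOIN CoreHR_EmploymentRecord r
       ON i.TenantId = r.TenantId
      AND i.UserID   = r.UserID
      AND r.TenantId = 106996
WHERE i.TenantId = 106996
  AND r.IsCurrentRecord = 1
  AND r.StartDate BETWEEN '2018-05-24 00:00:00' AND '2018-06-22 23:59:59.998'
  AND i.ApprovalStatus IN ('4')
GROUP BY r.ChangeTypeOID
ORDER BY r.ChangeTypeOID ASC;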
Execution plan (without indexes):
"Gather Motion 3:1 (slice4; segments: 3) (cost=0.00..865.47 rows=18 width=18)"
" Merge Key: record.changetypeoid"
" Rows out: 8 rows at destination with 13 ms to end."
" -> GroupAggregate (cost=0.00..865.47 rows=6 width=18)"
" Group By: record.changetypeoid"
" Rows out: Avg 2.7 rows x 3 workers. Max 4 rows (seg0) with 11 ms to end."
" Executor memory: 8K bytes avg, 8K bytes max (seg0)."
" -> Sort (cost=0.00..865.47 rows=6 width=18)"
" Sort Key: record.changetypeoid"
" Sort Method: quicksort Max Memory: 33KB Avg Memory: 33KB (3 segments)"
" Rows out: Avg 6.7 rows x 3 workers. Max 12 rows (seg0) with 11 ms to end."
" Executor memory: 58K bytes avg, 58K bytes max (seg0)."
" Work_mem used: 58K bytes avg, 58K bytes max (seg0). Workfile: (0 spilling)"
" -> Redistribute Motion 3:3 (slice3; segments: 3) (cost=0.00..865.47 rows=6 width=18)"
" Hash Key: record.changetypeoid"
" Rows out: Avg 6.7 rows x 3 workers at destination. Max 12 rows (seg0) with 10 ms to end."
" -> Result (cost=0.00..865.47 rows=6 width=18)"
" Rows out: Avg 6.7 rows x 3 workers. Max 7 rows (seg1) with 7.992 ms to first row, 8.138 ms to end."
" -> GroupAggregate (cost=0.00..865.47 rows=6 width=18)"
" Group By: record.changetypeoid"
" Rows out: Avg 6.7 rows x 3 workers. Max 7 rows (seg1) with 7.990 ms to first row, 8.131 ms to end."
" Executor memory: 248K bytes avg, 258K bytes max (seg1)."
" Work_mem used: 58K bytes avg, 58K bytes max (seg0)."
" -> Sort (cost=0.00..865.47 rows=41 width=18)"
" Sort Key: record.changetypeoid"
" Sort Method: quicksort Max Memory: 49KB Avg Memory: 38KB (3 segments)"
" Rows out: Avg 48.7 rows x 3 workers. Max 63 rows (seg2) with 7.598 ms to first row, 7.603 ms to end."
" Executor memory: 58K bytes avg, 58K bytes max (seg0)."
" Work_mem used: 58K bytes avg, 58K bytes max (seg0). Workfile: (0 spilling)"
" -> Redistribute Motion 3:3 (slice2; segments: 3) (cost=0.00..865.45 rows=41 width=18)"
" Hash Key: info.userid"
" Rows out: Avg 48.7 rows x 3 workers at destination. Max 63 rows (seg2) with 7.178 ms to first row, 7.193 ms to end."
" -> Hash Join (cost=0.00..865.44 rows=41 width=18)"
" Hash Cond: info.tenantid = record.tenantid AND info.userid = record.userid"
" Rows out: Avg 48.7 rows x 3 workers. Max 50 rows (seg1) with 4.240 ms to first row, 4.995 ms to end."
" Executor memory: 7K bytes avg, 7K bytes max (seg0)."
" Work_mem used: 7K bytes avg, 7K bytes max (seg0). Workfile: (0 spilling)"
" (seg1) Hash chain length 1.0 avg, 2 max, using 141 of 131072 buckets."
" -> Dynamic Table Scan on info (dynamic scan id: 1) (cost=0.00..431.94 rows=258 width=10)"
" Filter: tenantid = 106996 AND approvalstatus::text = '4'::text"
" Rows out: Avg 772.7 rows x 3 workers. Max 783 rows (seg2) with 1.668 ms to first row, 1.836 ms to end."
" Partitions scanned: Avg 1.0 (out of 286) x 3 workers. Max 1 parts (seg0)."
" -> Hash (cost=100.00..100.00 rows=34 width=4)"
" Rows in: Avg 146.0 rows x 3 workers. Max 146 rows (seg0) with 5.202 ms to end, start offset by 34 ms."
" -> Partition Selector for info (dynamic scan id: 1) (cost=10.00..100.00 rows=34 width=4)"
" Rows out: Avg 146.0 rows x 3 workers. Max 146 rows (seg0) with 4.725 ms to first row, 5.104 ms to end."
" -> Broadcast Motion 3:3 (slice1; segments: 3) (cost=0.00..433.31 rows=144 width=22)"
" Rows out: Avg 146.0 rows x 3 workers at destination. Max 146 rows (seg0) with 3.820 ms to first row, 3.934 ms to end."
" -> Result (cost=0.00..433.25 rows=48 width=22)"
" Rows out: Avg 48.7 rows x 3 workers. Max 53 rows (seg2) with 4.080 ms to first row, 4.593 ms to end."
" -> Sequence (cost=0.00..433.25 rows=48 width=30)"
" Rows out: Avg 48.7 rows x 3 workers. Max 53 rows (seg2) with 4.078 ms to first row, 4.582 ms to end."
" -> Partition Selector for record (dynamic scan id: 2) (cost=10.00..100.00 rows=34 width=4)"
" Partitions selected: 1 (out of 128)"
" Rows out: 0 rows (seg0) with 0.018 ms to end."
" -> Dynamic Table Scan on record (dynamic scan id: 2) (cost=0.00..433.25 rows=48 width=30)"
" Filter: startdate >= '2018-05-24'::date AND startdate <= '2018-06-22'::date AND tenantid = 106996 AND iscurrentrecord = 1"
" Rows out: Avg 48.7 rows x 3 workers. Max 53 rows (seg2) with 4.060 ms to first row, 4.542 ms to end."
" Partitions scanned: Avg 1.0 (out of 128) x 3 workers. Max 1 parts (seg0)."
"Slice statistics:"
" (slice0) Executor memory: 880K bytes."
" (slice1) Executor memory: 1070K bytes avg x 3 workers, 1070K bytes max (seg0)."
" (slice2) Executor memory: 2844K bytes avg x 3 workers, 2844K bytes max (seg0). Work_mem: 7K bytes max."
" (slice3) Executor memory: 282K bytes avg x 3 workers, 293K bytes max (seg2). Work_mem: 58K bytes max."
" (slice4) Executor memory: 219K bytes avg x 3 workers, 219K bytes max (seg0). Work_mem: 58K bytes max."
"Statement statistics:"
" Memory used: 50176K bytes"
"Optimizer status: PQO version 2.62.0"
"Total runtime: 51.404 ms"
Upvotes: 0
Views: 246
Reputation: 227
Greenplum Database performs very fast sequential scans; indexes use a random seek pattern to locate records on disk.
Greenplum data is distributed across the segments, so each segment scans a smaller portion of the overall data to get the result. With table partitioning, the total data to scan may be even smaller.
Because business intelligence query workloads generally return very large data sets, using indexes is not efficient.
First, try your query workload without adding indexes.
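As a sketch of what that means in DDL (hypothetical column types and partition boundaries, not your actual schema; DISTRIBUTED BY spreads rows across segments, and PARTITION BY RANGE lets each segment scan only the partitions matching the date filter, no index required):

-- Hypothetical example, not the asker's actual DDL.
CREATE TABLE corehr_employmentrecord (
    tenantid        int,
    userid          int,
    changetypeoid   int,
    changetypealias varchar(64),
    iscurrentrecord smallint,
    startdate       date
)
DISTRIBUTED BY (tenantid, userid)
PARTITION BY RANGE (startdate)
(
    START (date '2018-01-01') INCLUSIVE
    END   (date '2019-01-01') EXCLUSIVE
    EVERY (INTERVAL '1 month')
);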
Are you using ORCA or the legacy planner?
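You can check and switch this per session with the optimizer GUC (a quick sketch; for what it's worth, the "Optimizer status: PQO version 2.62.0" line in your plan suggests ORCA, the Pivotal Query Optimizer):

-- on = ORCA, off = legacy Postgres planner.
SHOW optimizer;
-- Toggle it and re-run the query to compare plans and timings.
SET optimizer = off;
EXPLAIN ANALYZE SELECT 1;   -- substitute the real query here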
On the timing issue --
Where are you generating the timing from: a remote client, or the psql console? EXPLAIN ANALYZE shows the actual processing time of the query in flight.
You would need to post EXPLAIN ANALYZE output for any real analysis of the plan.
Upvotes: 1