Reputation: 11
There is one 400 GB table in Greenplum. The following query takes considerable time even though all resources on the server (4 data segments) are free:
select max(date_key) from tablex;
The table structure is as follows:
Upvotes: 0
Views: 599
Reputation: 291
The master receives, parses, and optimizes the query. The resulting query plan is either parallel or targeted. The master dispatches parallel query plans to all segments, as shown in Figure 1. The master dispatches targeted query plans to a single segment, as shown in Figure 2. Each segment is responsible for executing local database operations on its own set of data.

Most database operations, such as table scans, joins, aggregations, and sorts, execute across all segments in parallel. Each operation is performed on a segment database independent of the data stored in the other segment databases.

Certain queries may access only data on a single segment, such as single-row INSERT, UPDATE, DELETE, or SELECT operations or queries that filter on the table distribution key column(s). In queries such as these, the query plan is not dispatched to all segments, but is targeted at the segment that contains the affected or relevant row(s).
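To make the parallel versus targeted distinction concrete, here is a rough toy model in Python. It is not Greenplum code and does not reflect how Greenplum actually hashes or dispatches: the modulo placement rule, the function names, and the sample rows are all assumptions made up for illustration. It only shows that an aggregate like max(date_key) has to visit every segment, while a filter on the distribution key can be answered by one.

```python
# Toy model only: Greenplum's real hashing and dispatch are internal to the database.
N_SEGMENTS = 4  # matches the 4 data segments mentioned in the question


def segment_for(dist_key: int) -> int:
    """Hypothetical placement rule: distribution key modulo segment count."""
    return dist_key % N_SEGMENTS


def distribute(rows):
    """Place each (dist_key, date_key) row on exactly one segment."""
    segments = [[] for _ in range(N_SEGMENTS)]
    for row in rows:
        segments[segment_for(row[0])].append(row)
    return segments


def parallel_max(segments):
    """Parallel plan: every segment computes a local max, the master combines them."""
    partials = [max(date_key for _, date_key in seg) for seg in segments if seg]
    return max(partials), len(segments)  # (result, segments touched)


def targeted_lookup(segments, dist_key):
    """Targeted plan: a filter on the distribution key touches one segment only."""
    seg = segments[segment_for(dist_key)]
    return [row for row in seg if row[0] == dist_key], 1


# Made-up sample data: (dist_key, date_key as an integer)
rows = [(i, 20160101 + i) for i in range(10)]
segments = distribute(rows)

best, touched_parallel = parallel_max(segments)
match, touched_targeted = targeted_lookup(segments, 7)
```

In this model the max still scans every row, just split four ways, which is consistent with a full scan of a 400 GB table taking a while even when the work is spread across segments.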
Understanding Greenplum Query Plans

A query plan is the set of operations Greenplum Database will perform to produce the answer to a query. Each node or step in the plan represents a database operation such as a table scan, join, aggregation, or sort. Plans are read and executed from bottom to top.

In addition to common database operations such as table scans, joins, and so on, Greenplum Database has an additional operation type called motion. A motion operation involves moving tuples between the segments during query processing. Note that not every query requires a motion. For example, a targeted query plan does not require data to move across the interconnect.

To achieve maximum parallelism during query execution, Greenplum divides the work of the query plan into slices. A slice is a portion of the plan that segments can work on independently. A query plan is sliced wherever a motion operation occurs in the plan, with one slice on each side of the motion. For example, consider the following simple query involving a join between two tables:

SELECT customer, amount FROM sales JOIN customer USING (cust_id) WHERE dateCol = '04-30-2016';
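As a rough illustration of motions and slices, the sketch below mimics one plausible shape for a join like the one above. It assumes that sales is distributed by sale_id and customer by cust_id, so sales rows have to move before they can be joined locally. Everything here (the modulo placement, the helper names, the sample rows) is hypothetical, the date filter is left out for brevity, and the real plan is whatever EXPLAIN reports on your cluster.

```python
# Toy model of a redistribute motion and a gather motion; not Greenplum internals.
N_SEGMENTS = 4


def redistribute(segments, key_index):
    """Toy redistribute motion: re-place every row by the value at key_index."""
    out = [[] for _ in range(N_SEGMENTS)]
    for seg in segments:
        for row in seg:
            out[row[key_index] % N_SEGMENTS].append(row)
    return out


def local_join(sales_seg, customer_seg):
    """Join on a single segment, using only the rows stored there."""
    names = {cust_id: name for cust_id, name in customer_seg}
    return [(names[cust_id], amount)
            for _sale_id, cust_id, amount in sales_seg
            if cust_id in names]


def gather(per_segment_results):
    """Toy gather motion: the master collects rows from every segment."""
    collected = []
    for part in per_segment_results:
        collected.extend(part)
    return collected


# Made-up rows: sales = (sale_id, cust_id, amount), customers = (cust_id, name)
sales = [(1, 10, 5.0), (2, 11, 7.5), (3, 10, 2.0), (4, 13, 9.0)]
customers = [(10, "ann"), (11, "bob"), (13, "cy")]

sales_by_sale_id = redistribute([sales], 0)         # initial placement by sale_id
customer_by_cust_id = redistribute([customers], 0)  # initial placement by cust_id

# Without a motion, matching rows sit on different segments and the join finds little.
colocated = sum(len(local_join(s, c))
                for s, c in zip(sales_by_sale_id, customer_by_cust_id))

# One slice: scan sales and redistribute it on cust_id.
sales_by_cust_id = redistribute(sales_by_sale_id, 1)
# Next slice: each segment joins its own rows independently.
joined = [local_join(s, c) for s, c in zip(sales_by_cust_id, customer_by_cust_id)]
# Final slice: the master gathers the per-segment results.
result = sorted(gather(joined))
```

Each call that moves rows between segments stands in for a motion, and the work on either side of it corresponds to a separate slice in the sense described above.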
Link: https://docs.greenplum.org/6-9/admin_guide/query/topics/parallel-proc.html
Upvotes: 0