Reputation: 5856
I have the following Query/View:
CREATE OR REPLACE VIEW "SumAndSalesPerCountryYear" AS
SELECT date_part('year'::text, "Invoice"."InvoiceDate") AS year,
"Invoice"."BillingCountry" AS country,
sum("Invoice"."Total") AS total
FROM "Invoice"
GROUP BY date_part('year'::text, "Invoice"."InvoiceDate"), "Invoice"."BillingCountry"
ORDER BY date_part('year'::text, "Invoice"."InvoiceDate") DESC, sum("Invoice"."Total") DESC;
My table structure is as follows:
CREATE TABLE "Invoice"
(
"InvoiceId" integer NOT NULL,
"CustomerId" integer NOT NULL,
"InvoiceDate" timestamp without time zone NOT NULL,
"BillingAddress" character varying(70),
"BillingCity" character varying(40),
"BillingState" character varying(40),
"BillingCountry" character varying(40),
"BillingPostalCode" character varying(10),
"Total" numeric(10,2) NOT NULL,
CONSTRAINT "PK_Invoice" PRIMARY KEY ("InvoiceId"),
CONSTRAINT "FK_InvoiceCustomerId" FOREIGN KEY ("CustomerId")
REFERENCES "Customer" ("CustomerId") MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
)
WITH (
OIDS=FALSE
);
The current execution plan is
Sort (cost=33.65..34.54 rows=354 width=21) (actual time=0.691..0.698 rows=101 loops=1)
Sort Key: (date_part('year'::text, "Invoice"."InvoiceDate")), (sum("Invoice"."Total"))
Sort Method: quicksort Memory: 32kB
-> HashAggregate (cost=14.24..18.67 rows=354 width=21) (actual time=0.540..0.567 rows=101 loops=1)
-> Seq Scan on "Invoice" (cost=0.00..11.15 rows=412 width=21) (actual time=0.015..0.216 rows=412 loops=1)
Total runtime: 0.753 ms
My task is to optimize the query by using indices; however, I cannot think of a way to use indices to optimize aggregate results.
Upvotes: 0
Views: 123
Reputation: 45845
You can try to penalize HashAggregate with "SET enable_hashagg TO off", but with data this small there will probably be no benefit from an index. In this use case HashAggregate is usually the fastest method for aggregation, and a 32kB quicksort is very fast.
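A minimal sketch of how to test that for the current session only (note that EXPLAIN ANALYZE actually executes the query):

SET enable_hashagg TO off;

EXPLAIN ANALYZE
SELECT * FROM "SumAndSalesPerCountryYear";

-- restore the planner default afterwards
RESET enable_hashagg;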
But you are trying to run a performance benchmark on a table with 412 rows, which is meaningless. Reasoning about performance only makes sense on data sized like two to three years of production usage.
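If you want a more realistic test, one hypothetical way to inflate the table is to clone the existing rows with generate_series; the multiplier of 1000 and the id/date offsets below are arbitrary:

INSERT INTO "Invoice" ("InvoiceId", "CustomerId", "InvoiceDate",
                       "BillingAddress", "BillingCity", "BillingState",
                       "BillingCountry", "BillingPostalCode", "Total")
SELECT "InvoiceId" + n * 100000,          -- keep primary keys unique per copy
       "CustomerId",
       "InvoiceDate" + (n || ' days')::interval,  -- spread dates across years
       "BillingAddress", "BillingCity", "BillingState",
       "BillingCountry", "BillingPostalCode", "Total"
FROM "Invoice", generate_series(1, 1000) AS n;

-- refresh planner statistics after the bulk load
ANALYZE "Invoice";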
Upvotes: 2
Reputation: 78503
As noted by Pavel, Ramfjord, and horse, using an index is of little use with such a tiny amount of data. The table is so small that it's faster for Postgres to read a disk page or two and process everything in memory.
Further, you already have the best possible plan for your query. You're asking Postgres to compute an aggregate over an entire table and to return it in a certain order. Postgres computes the aggregate in memory without sorting the data first, accumulating intermediate results in a hash table; it then sorts the small number of result rows according to your criteria.
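For reference, if the table were large enough for an index to matter, an expression index matching the GROUP BY keys could let the planner aggregate pre-sorted data with a GroupAggregate instead of hashing (a sketch; the index name is made up). The final ORDER BY on the sum would still require a sort, since the total isn't known until aggregation finishes:

CREATE INDEX "IX_Invoice_Year_Country"
    ON "Invoice" ((date_part('year'::text, "InvoiceDate")), "BillingCountry");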
Upvotes: 1