Reputation: 65510
I am using Postgres 9.4. I have a large table. This is the structure of my table:
processing_date | date |
practice_id | character varying(6) |
chemical_id | character varying(9) |
items | bigint |
cost | double precision |
Indexes:
"vw_idx_chem_by_practice_chem_id" btree (chemical_id)
"vw_idx_chem_by_practice_chem_id_vc" btree (chemical_id varchar_pattern_ops)
"vw_idx_chem_by_practice_joint_id" btree (practice_id, chemical_id)
Now I want to run a LIKE query on the table. This is my query:
EXPLAIN (ANALYSE, BUFFERS) SELECT sum(pr.cost) as actual_cost,
sum(pr.items) as items, pr.practice_id as row_id,
pc.name as row_name, pr.processing_date as date
FROM vw_chemical_summary_by_practice pr
JOIN frontend_practice pc ON pr.practice_id=pc.code
WHERE (pr.chemical_id LIKE '0401%' )
GROUP BY pr.practice_id, pc.code, date
ORDER BY date, pr.practice_id;
And this is the result of an EXPLAIN: http://explain.depesz.com/s/lYRT
As you can see, it's slow partly because it's running a bitmap heap scan on nearly 4 million rows. (The subsequent sort is also slow.)
Is there anything I can do to speed this up?
I'm wondering if I should create a further materialized view, or whether a multicolumn index would help, so that Postgres can look in the index rather than disk.
Is there some way I can also make the sort more efficient?
UPDATE: Here's the definition of the materialized view:
CREATE MATERIALIZED VIEW vw_chemical_summary_by_practice
AS SELECT processing_date, practice_id, chemical_id,
SUM(total_items) AS items, SUM(actual_cost) AS cost
FROM frontend_prescription
GROUP BY processing_date, practice_id, chemical_id
And the underlying table:
id | integer | not null default nextval('frontend_prescription_id_seq'::regclass)
presentation_code | character varying(15) | not null
total_items | integer | not null
actual_cost | double precision | not null
processing_date | date | not null
practice_id | character varying(6) | not null
Indexes:
"frontend_prescription_pkey" PRIMARY KEY, btree (id)
"frontend_prescription_528f368c" btree (processing_date)
"frontend_prescription_6ea07fe3" btree (practice_id)
"frontend_prescription_idx_code" btree (presentation_code varchar_pattern_ops)
"frontend_prescription_idx_date_and_code" btree (processing_date, presentation_code)
UPDATE 2: In case it wasn't clear, I need to get total spending and items by practice and by month, across all chemical_ids that start with `0401.
Upvotes: 2
Views: 2247
Reputation: 4503
-- assuming this is your original table:
CREATE TABLE practice_chemical_old
( processing_date date NOT NULL
, practice_id character varying(6) NOT NULL
, chemical_id character varying(9) NOT NULL
, items bigint NOT NULL DEFAULT NULL
, cost double precision
);
-- create these three new tables to decompose it into
CREATE TABLE practice
( practice_id SERIAL NOT NULL PRIMARY KEY
, practice_name character varying(6) UNIQUE
);
CREATE TABLE chemical
( chemical_id SERIAL NOT NULL PRIMARY KEY
, chemical_name character varying(9) UNIQUE
);
CREATE TABLE practice_chemical_new
( practice_id INTEGER NOT NULL REFERENCES practice (practice_id)
, chemical_id INTEGER NOT NULL REFERENCES chemical (chemical_id)
, processing_date date NOT NULL
, items bigint NOT NULL default 0
, cost double precision
-- Not sure if processing_date should be part of the key, too
, PRIMARY KEY (practice_id, chemical_id)
);
CREATE UNIQUE INDEX ON practice_chemical_new(chemical_id, practice_id);
INSERT INTO practice(practice_name)
SELECT DISTINCT practice_id FROM practice_chemical_old;
INSERT INTO chemical(chemical_name)
SELECT DISTINCT chemical_id FROM practice_chemical_old;
-- now populate the new tables from the old ones ...
INSERT INTO practice_chemical_new(practice_id, chemical_id, processing_date,items,cost)
SELECT p.practice_id, c.chemical_id, pco.processing_date, pco.items, pco.cost
FROM practice_chemical_old pco
JOIN practice p ON p.practice_name = pco.practice_id
JOIN chemical c ON c.chemical_name = pco.chemical_id
;
-- Now, the original table *could* be represented by the following view (or table, or table expression):
CREATE VIEW practice_chemical_fake AS
SELECT pcn.processing_date AS processing_date
, p.practice_name AS practice_id
, c.chemical_name AS chemical_id
, pcn.items AS items
, pcn.cost AS cost
FROM practice_chemical_new pcn
JOIN practice p ON p.practice_id = pcn.practice_id
JOIN chemical c ON c.chemical_id = pcn.chemical_id
;
Note: it is not clear from the original question if multiple instances of {practice,chemical} are possible (with different processing_date). You could need to change the definition of the PK a bit.
Upvotes: 3