user2641043

Reputation: 405

Big Table Optimisation

I have a 12 million row table, so not enormous, but I want to optimize it for reads as much as possible.

For example, currently running:

SELECT * 
FROM hp.historicalposition 
WHERE instrumentid = 1167 AND fundid = 'XXX' 
ORDER BY date;

returns 4,200 rows and takes about 4 seconds the first time it is run and about 1 second the second time.
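
The cold/warm gap suggests the first run is mostly paying for disk reads. The plan and buffer usage can be checked with EXPLAIN; the ANALYZE and BUFFERS options execute the query and report shared-buffer hits versus reads:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * 
FROM hp.historicalposition 
WHERE instrumentid = 1167 AND fundid = 'XXX' 
ORDER BY date;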

What indices might help, and are there any other suggestions?

CREATE TABLE hp.historicalposition
(
  date date NOT NULL,
  fundid character(3) NOT NULL,
  instrumentid integer NOT NULL,
  quantityt0 double precision,
  quantity double precision,
  valuation character varying,
  fxid character varying,
  localt0 double precision,
  localt double precision,
  CONSTRAINT attrib_fund_fk FOREIGN KEY (fundid)
      REFERENCES funds (fundid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION,
  CONSTRAINT attrib_instr_fk FOREIGN KEY (instrumentid)
      REFERENCES instruments (instrumentid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION
)

Upvotes: 0

Views: 86

Answers (3)

Tometzky

Reputation: 23890

  1. You definitely need an `(instrumentid, fundid)` index:
    create index historicalposition_instrumentid_fundid_idx
      on hp.historicalposition (instrumentid, fundid);
    
  2. You can then organize your table data so that its physical order on disk matches this index (see the maintenance sketch after this list):
    cluster hp.historicalposition using historicalposition_instrumentid_fundid_idx;
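
Note that `cluster` is a one-time rewrite: rows inserted later are not kept in index order, so it has to be re-run periodically, and it takes an exclusive lock on the table while it runs. A minimal maintenance sketch:

    cluster hp.historicalposition;  -- reuses the index remembered from the last cluster ... using
    analyze hp.historicalposition;  -- refresh planner statistics after the rewrite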
    

Upvotes: 1

Gordon Linoff

Reputation: 1269873

Here is your query:

SELECT * 
FROM hp.historicalposition 
WHERE instrumentid = 1167 AND fundid = 'XXX' 
ORDER BY date;

The best index is a composite index:

create index idx_historicalposition_instrumentid_fundid_date on hp.historicalposition (instrumentid, fundid, date);

This satisfies the where clause, and because date is the trailing key column, it can also be used for the order by: with instrumentid and fundid fixed to constants, the matching index entries are already ordered by date, so no separate sort is needed.
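
Column order matters here: the same b-tree also serves filters on a leading prefix of the key, but the free ordering only holds while every column before date is pinned to a constant. A small illustration (same table, hypothetical second query):

-- served by the index and already sorted: instrumentid and fundid are fixed,
-- so the matching index entries come out ordered by date
SELECT * FROM hp.historicalposition
WHERE instrumentid = 1167 AND fundid = 'XXX'
ORDER BY date;

-- can still use the index via the leading column, but now needs a Sort step:
-- with fundid unpinned, index order is (fundid, date), not plain date
SELECT * FROM hp.historicalposition
WHERE instrumentid = 1167
ORDER BY date;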

Upvotes: 1

Anonymous

Reputation: 18631

General ideas, not necessarily all applicable to PostgreSQL (in fact, they come from the Oracle world); a rough PostgreSQL translation is sketched after the list:

  • Partition by time (e.g. day/week/whatever seems most applicable).
  • If there is only one way of accessing the data and the table is of the write-once type, an index-organised table (a.k.a. a clustered index) could help. Also tweak the write settings so that no free space is left in the pages written to disk.
  • Consider using compression, to reduce the number of physical reads needed.
  • Have a database job that regularly updates the statistics.
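
What these might look like in PostgreSQL terms (a sketch only; the partitioned copy, partition bounds, and fillfactor value are illustrative assumptions, not part of the answer above):

-- Partition by time: declarative range partitioning (PostgreSQL 10+), shown on
-- a hypothetical copy of the table with yearly partitions
CREATE TABLE hp.historicalposition_parted (
  date date NOT NULL,
  fundid character(3) NOT NULL,
  instrumentid integer NOT NULL
  -- remaining columns as in the original table
) PARTITION BY RANGE (date);

CREATE TABLE hp.historicalposition_2016
  PARTITION OF hp.historicalposition_parted
  FOR VALUES FROM ('2016-01-01') TO ('2017-01-01');

-- "No free space in pages": controlled by the fillfactor storage parameter;
-- 100 (fully packed) is already the default for tables
ALTER TABLE hp.historicalposition SET (fillfactor = 100);

-- Compression: core PostgreSQL compresses large values automatically via TOAST;
-- there is no general table-level compression switch to flip

-- Statistics job: autovacuum normally keeps statistics fresh; an explicit
-- ANALYZE can additionally be scheduled after bulk loads
ANALYZE hp.historicalposition;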

Upvotes: 0
