user7375496
user7375496

Reputation: 21

Postgres primary key 'less than' operation is slow

Consider the following table

CREATE TABLE COMPANY(
   ID BIGINT PRIMARY KEY     NOT NULL,
   NAME           TEXT    NOT NULL,
   AGE            INT     NOT NULL,
   ADDRESS        CHAR(50),
   SALARY         REAL
);

If we have 100 million random data in this table.

Select age from company where id=2855265

Executed in less than a millisecond

Select age from company where id<353

Return less than 50 rows and Executed in less than a millisecond

Both query uses index

But the following query use full table scan and executed in 3 seconds

Select age from company where id<2855265

Return less than 500 rows

How can I speed up the query that select primary key less than variable?

Upvotes: 2

Views: 1092

Answers (2)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656982

Performance

The predicate id < 2855265 potentially returns a large percentage of rows in the table. Unless Postgres has information in table statistics to expect only around 500 rows, it might switch from an index scan to a bitmap index scan or even a sequential scan. Explanation:

We would need to see the output from EXPLAIN (ANALYZE, BUFFERS) for your queries.

When you repeat the query, do you get the same performance? There may be caching effects.

Either way, 3 seconds is way to slow for 500 rows, Postgres might be working with outdated or inexact table statistics. Or there may be issues with your server configuration (not enough resources). Or there can be several other not so common reasons, including hardware issues ...

If VACUUM ANALYZE did not help, VACUUM FULL ANALYZE might. It effectively rewrites the whole table and all indexes in pristine condition. Takes an exclusive lock on the table and might conflict with concurrent access!

I would also consider increasing the statistics target for the id column. Instructions:


Table definition?

Whatever else you do, there seem to be various problems with your table definition:

CREATE TABLE COMPANY(
   ID BIGINT PRIMARY KEY NOT NULL,  -- int is probably enough. "id" is a terrible column name
   NAME      TEXT    NOT NULL,  -- "name" is a teriible column name
   AGE       INT     NOT NULL,  -- typically bad idea to store age, store birthday instead
   ADDRESS   CHAR(50),   -- never use char(n)!
   SALARY    REAL        -- why would a company have a salary? never store money as real
);

You probably want something like this instead:

CREATE TABLE emmployee(
   emploee_id    serial PRIMARY KEY
   company_id    int NOT NULL  -- REFERENCES company(company_id)?
 , birthday      date NOT NULL
 , employee_name text NOT NULL
 , address       varchar(50)   -- or just text
 , salary        int           -- store amount as *Cents*
);

Related:

Upvotes: 1

jlee88my
jlee88my

Reputation: 3043

You will need to perform a VACUUM ANALYZE company; to update the planning.

Upvotes: 0

Related Questions